Preface

This beginning textbook in statistical methods has been written
to meet the needs of undergraduate college students who are 
concentrating in sociology and related subjects. In the choice 
of methods, in the character of the illustrative data and problems, 
and in emphasis throughout, it differs from the texts in eco- 
nomic or educational statistics that have generally been used 
by such students. 

The chief purpose has been to provide students who expect to 
become professional sociologists with the necessary groundwork
for more advanced training in quantitative research methods. 
Familiarity with the topics included, however, should enable 
those who take no further courses in statistics to understand 
most of the statistical studies and references that now appear 
in the sociological journals and literature. Nonprofessional 
students who go through the course should learn to appreciate 
some of the difficulties involved in the study of social problems, 
and to be more wary of careless and prejudiced thinking in this 
field; for mathematical statistics represents a rigorous form of 
applied logic. 

Unfortunately, most students who elect to specialize in soci- 
ology have no mathematical training beyond high school algebra. 
This fact has compelled the omission of mathematical deriva- 
tions, with the exception of a few very simple ones. As a 
substitute, an attempt has been made to point out assumptions 
that should be watched in using the various formulas. Students 
who plan to go on in the subject, however, should begin at once 
to build up an adequate mathematical background. 

Because of its complications and as yet very infrequent use in 
sociological research, small-sampling theory has for the most part 
been omitted from this elementary treatment. 

The amount of material covered is more than enough for a 
semester's work with an average class, so that some selection of
topics is possible for the instructor. Under certain
circumstances, it may be advisable to omit the less easy sections of
chapters IX, XI, XII, XIII, and XIV. 

Constant practice in working statistical problems is
indispensable for mastery of the subject. The problems given at
the end of each chapter are intended to be only suggestive; they
should be greatly multiplied for laboratory purposes.

Thanks are due Professor E. A. Gaumnitz of the University
of Wisconsin, who has read the manuscript and made helpful 
suggestions, and Mr. Robert J. Hader, who has eliminated 
numerous minor errors. 

Special acknowledgment is made of permission by Prof. R. A. 
Fisher and his publishers, Oliver & Boyd, Edinburgh, to use the
Table of Chi-square and the Table of Values of the Correlation 
Coefficient for Different Levels of Significance, which appear as 
Tables 2 and 4 in the Appendix of this book. Many other 
publishers have been kind enough to grant permission to use 
tables and material, specific acknowledgment of which has been 
made in place. 

Thomas C. McCormick. 

Madison, Wis.,

August, 1941.

CHAPTER I


INTRODUCTORY 

1. The Origins of Statistics. — The word statistics was used in
Great Britain and Statistik in Germany as early as the eighteenth
century to refer to collections of information of any kind about a
state (state-istics). As time passed, “statistics” came to be
limited to quantitative data or figures on wealth, taxes, marriages,
baptisms, deaths, and the like. Distinguished pioneers in the
field were the Germans, Achenwall and Busching. Modern
agencies representing this type of statistics are the census bureaus
of the United States and other nations.

Mathematical statistics, a branch of mathematical theory, 
originated in investigations of birth and death rates and in 
efforts to solve problems growing out of games of chance. Among 
the great early vital statisticians were Graunt and Petty of 
England and Sussmilch of Germany. The fundamentals of 
the theory of probability were developed from the seventeenth 
to the nineteenth century by such eminent mathematicians as
Pascal, Bernoulli, de Moivre, Laplace, and Gauss. 

Elementary mathematical statistics was popularized in the
nineteenth century by the Belgian, Quetelet, who applied it to a
wide variety of topics, including physical anthropology and
crime. He is sometimes called the father of social statistics, as
the extension of statistics to sociological problems may be
termed.

A rapid expansion in mathematical statistics and its use in 
science occurred in England during the first quarter of the 
present century through the work of Karl Pearson, following 
earlier efforts by Sir Francis Galton. These two men were 
biologists, and Pearson was a mathematician as well. As a 
result of this phase, modern statistical methods bear the imprint
of adaptation to biological data.

Mathematical statistics has gradually become a major method 
of research in the fields of agriculture, biology, educational
psychology, psychology, geography, and physical anthropology.
Among the social sciences, education, economics, and social
psychology at present lead in the proportion of statistical studies
published, with sociology fourth and political science fifth.
Statistical analysis is still rare in cultural anthropology and 
history. In a different direction, statistics is finding application 
in mathematical physics, engineering, and medicine. 

In this book, we shall be interested in elementary statistical 
methods only as tools of investigation in sociology and related 
social sciences. 

2. Quality and Quantity. — A qualitative difference implies a
difference in nature, such as we recognize between a family and a
church. A quantitative difference refers to a variation in
amount between two or more instances of the same quality: for
example, an intelligence quotient (I.Q.) of 112 is 14 units greater
than an intelligence quotient of 98.

Different qualities must be compared in terms of common 
subqualities (common denominators), as a Presbyterian family
and a Methodist church, a family of five members and a church
of 500 members. A pure quality, moreover, can vary only in
amount. It follows that all comparison must consist in noting
what qualities are and are not common to A and B, and how
each common quality varies in amount from A to B. The city
and the country may be compared in terms of common qualities 
like population density, birth rate, death rate, incidence of 
tuberculosis, intelligence quotients, honesty, and so on; but in 
each instance the difference must be in terms of amount. Thus 
the birth rate of the city is 15 per 1,000, that of the country is 
22 per 1,000; and country people are believed to be more honest 
than city people. The last judgment is no less quantitative in 
nature because it is impressionistic and rough. 

Because comparison is basic to knowledge, and quantitative 
judgments are inseparable from comparison, quantitative judg- 
ments are unavoidable in science. It is thus easy to understand 
why scientists have gradually developed more and more system- 
atic and reliable ways of making quantitative judgments, such 
as we have in the many branches of mathematics, including 
mathematical statistics. 

3. Statistics, the Method of Probabilities. — The questions in 
which social scientists are interested do not have exact or certain
quantitative answers. For example, if we ask what is the
relation between the occurrence of divorce and the presence of 
children in the home, we find that divorce occurs both among 
couples with children and among couples without children, 
but relatively more often in the case of the latter. We cannot 
say that divorce takes place only when there are no children, but 
we can say that divorce is reported in so many childless marriages 
per 1,000, and in so many fertile marriages per 1,000. Or, 
expressing it a little differently, we can say that the chances of
divorce are X in 1,000 in the case of a childless couple, and Y 
in 1,000 in the case of a fertile couple. 
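
The rates "per 1,000" described above can be computed directly from counts. The following is a minimal Python sketch; the function name `rate_per_thousand` and all of the counts are hypothetical, invented only to show the arithmetic, and are not figures from any study.

```python
def rate_per_thousand(events: int, population: int) -> float:
    """Return the number of events per 1,000 members of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * events / population


# Hypothetical counts, invented for illustration only.
childless = {"marriages": 4000, "divorces": 80}
fertile = {"marriages": 6000, "divorces": 48}

# X and Y in the text: chances of divorce in 1,000 for each kind of couple.
x = rate_per_thousand(childless["divorces"], childless["marriages"])
y = rate_per_thousand(fertile["divorces"], fertile["marriages"])

print(f"childless couples: {x:.1f} per 1,000")
print(f"couples with children: {y:.1f} per 1,000")
```

Expressing both groups on the same base of 1,000 is what makes the two figures comparable when the groups differ in size.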

Statistical methods are specially designed for the analysis
of quantitative¹ data like those above that result from many
causes, some or all of which cannot be completely controlled.
Outside the scientific laboratory, and even in much laboratory
research, adequate control over all factors is out of the question.
For this reason, the statistical method has general application.

Mathematical statistics is a direct logical extension to practical 
situations of the exact quantitative methods used in the labora- 
tory experiments of the physical sciences. When precise 
measurement and complete control over all factors are possible, 
a mathematical equation can be set up from which the value of 
a dependent factor, Y, can be estimated exactly for any given
value of an independent factor, X. For example, if we know
the distance, X, of an object from the ground, we can calculate
from the law of falling bodies the time, Y, it will take for the
object to fall in a vacuum. When we actually drop an object
under these controlled conditions, a stop watch will always
register the length of time predicted by the equation. The
likelihood that the period observed in any competent repetition
of the experiment will be that computed from the equation is
certainty. 
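
As an illustration of such an exact equation, the law of falling bodies may be sketched in Python. The text does not write the equation out; the form used here is the standard one for a body falling from rest, X = ½gY², and the constant `G` and the function name `fall_time` are assumptions introduced for the example.

```python
import math

G = 9.80665  # standard gravity in m/s^2 (an assumed constant, not from the text)


def fall_time(distance_m: float, g: float = G) -> float:
    """Time Y in seconds for a body to fall distance X from rest in a vacuum.

    From X = (1/2) * g * Y**2 it follows that Y = sqrt(2 * X / g).
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return math.sqrt(2 * distance_m / g)


# Each distance X gives exactly one predicted time Y.
for x in (1.0, 5.0, 20.0):
    print(f"X = {x:5.1f} m  ->  Y = {fall_time(x):.3f} s")
```

The same distance always yields the same time; no element of probability enters.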

If, however, the object happens to be a feather which is 
dropped under ordinary atmospheric conditions rather than 
in a vacuum, the situation is different. In proportion as the 
factors are uncontrolled or unknown, the stop watch will no 
longer register the time predicted by the law of falling bodies. 
Nevertheless, if a large number of experiments are made by 

¹ By quantitative data are meant data that can be measured or counted,
as discussed below in Chap. II.

dropping the feather under ordinary atmospheric conditions
from the same distance, the time of falling will be found to vary
around some average time, being sometimes more and sometimes
less. Similarly, the average time of falling from other distances,
X, can be found, and an equation worked out from which the
average time of falling, Y, can be estimated for any distance, X.
Then by studying the varying time required for the object to
fall a given distance, it may be established that in, say, two-thirds
of the trials the time does not vary from the average time, say
2 sec., by more than, say, 0.1 sec. This enables us to make a
prediction from our empirical equation. We can say that if our
feather is dropped under ordinary atmospheric conditions from
a given height, the time required to fall will, two out of three
times, in the long run, vary from an estimated average of 2 sec.
by not more than 0.1 sec. in either direction. That is, in two
out of three trials, the time of falling will be between 1.9 and
2.1 sec.
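
The feather experiment can be imitated with a short Python simulation. This is only a sketch: it assumes that the times scatter in a normal (bell-shaped) pattern around 2 sec. with a standard deviation of 0.1 sec., which the text does not state, and the trials are generated by the computer rather than observed.

```python
import random
import statistics

random.seed(1941)  # fixed seed so the sketch is repeatable

# Assumption: fall times scatter normally around 2 sec. with a standard
# deviation of 0.1 sec. The text gives only the average and the 0.1 sec. band.
times = [random.gauss(2.0, 0.1) for _ in range(10_000)]

average = statistics.mean(times)

# Share of trials whose time lies within 0.1 sec. of the average.
within = sum(1 for t in times if abs(t - average) <= 0.1) / len(times)

print(f"average time of falling: {average:.2f} sec.")
print(f"share of trials within 0.1 sec. of the average: {within:.2f}")
```

Under this assumption roughly two trials in three fall inside the band, which is the kind of statement the empirical equation supports.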

This is, broadly speaking, the kind of estimate that
mathematical statistics furnishes in the social sciences. In essence it is
always a calculation of probabilities. The “pure” mathematical
formula of the laboratory is merely a special case of the statistical
equation, being the limit that the latter approaches as the amount
of control and precision of measurement are increased. If
sociological data could be exactly controlled and measured, the
element of probability would disappear, and the statistical
equation would become a precise one like the law of falling
bodies.

4. Representative Data. — Most sociological studies, statistical 
or otherwise, deal with samples rather than with complete data. 
If farm life in a given state is to be investigated, certain farms 
are taken as a sample to represent all the farms in the state 
regarded as the universe. The essential requirements of a good 
sample are that every item in the universe from which the sample 
is drawn shall have an equal chance of being included in the 
sample, and the sample must be large enough to include every 
kind of item in the universe in something like the correct
proportions. The proper size of the sample depends somewhat on how
much the items in the universe vary among themselves. Poor 
samples that include items from outside the universe they are 
intended to represent, that omit important elements of the
universe, or that include elements of the universe in the wrong
proportions, are a fertile source of false conclusions in social
research. A large part of mathematical statistics deals with
the problems of sampling.
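
The two requirements of a good sample, equal chance of inclusion and adequate size, can be illustrated with a small Python sketch. The universe of 1,000 farms, the three farm types, and their proportions are hypothetical and invented for the example; `random.sample` is the standard-library routine for drawing without replacement.

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical universe of 1,000 farms, tagged by type (invented proportions).
universe = ["dairy"] * 500 + ["grain"] * 300 + ["livestock"] * 200


def proportions(items):
    """Return the share of each kind of item, as fractions of the total."""
    counts = Counter(items)
    return {kind: counts[kind] / len(items) for kind in sorted(counts)}


# random.sample draws without replacement, giving every farm an equal chance.
small = random.sample(universe, 10)
large = random.sample(universe, 400)

print("universe:", proportions(universe))
print("n = 10:  ", proportions(small))
print("n = 400: ", proportions(large))
```

A very small sample may miss a kind of farm altogether or distort the proportions badly, while the larger sample will usually reproduce them closely.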

5. Statistics and the Individual. — It is commonly thought that
statistics cannot deal with the individual, but must confine itself 
to group averages. There is really nothing to prevent a statis- 
tical investigation of an individual. An individual may be 
readily analyzed into factors or units of various kinds, and the 
relationships of these to other factors in the same personality 
and in the environment can be studied by the same methods 
that are now used in studying groups of individuals. As a 
matter of economy, however, society will seldom want to subject 
individuals to scientific study except as types, which, of course, 
lead back to group averages. 

6. Interpretation of Statistical Results. — Statistics employs 
figures and mathematical symbols that represent definite factors
in a particular problem. In interpreting statistical results, 
therefore, care must be taken that each symbol is given the 
same meaning that was assigned to it at the beginning of the 
problem, and to which no important exceptions were allowed 
during the study. 

It is sometimes puzzling to understand the reasons for a
statistical fact, and offhand explanations may be found at the 
end of even careful studies. But if the original study was not 
sufficiently inclusive to clarify some point of interest, its reliable 
explanation can consistently come only from further research. 
For example, if an investigation discovers that a larger proportion 
of women are married in cities where the number of men exceeds 
the number of women than in cities where the two sexes are 
equal in number, or where women outnumber men, one may 
speculate that this is because men do the proposing. It should be 
made clear, however, that such an explanation is only a plausible 
“hunch,” which should be tested if it is considered of enough
importance. 

Difficulty may also be experienced in interpreting just what 
certain statistical concepts mean, e.g., correlation coefficients,
averages, or tests of statistical significance. The only help
here is a clearer understanding of statistical methods, and 
especially of the mathematical assumptions that underlie them. 





ELEMENTARY SOCIAL STATISTICS 


7. Statistics Not a Mechanical Method. — Although the
statistical method allows data to be treated by systematic and
standardized techniques, it is a serious mistake to suppose that
it is a mechanical method that may be substituted for hard and
original thinking. On the contrary, mathematical statistics is
merely a set of powerful logical tools that call for a high type of
judgment and skill for their successful use. The statistical
investigator must know what techniques are valid and effective
for a given problem, and when quantitative methods are not
appropriate at all. He needs insight to select worthwhile
problems, and intimate knowledge of the data to interpret his
findings, no less than does any other type of investigator.

8. Simplicity the Ideal. — The experienced statistician always 
prefers simple to complex methods, when the two are equally 
effective. The beginner will do well not to yield to the tempta- 
tion to depart from this sensible rule. 

Exercises 

1. Briefly summarize the history of statistics and the extent of its
use as a method of research.

2. Distinguish between quality and quantity. Illustrate.

3. Can you find an exception to the proposition that all comparison
is quantitative?

4. To what general kind of research situation is statistics appropriate,
and why? Illustrate.

5. What is the relationship of the statistical equation to the
mathematical “law” of physics?

6. a. How exactly can predictions be made by means of statistical
methods?

b. How serious a handicap does this impose on social research?

7. What is the likelihood in the field of social research that the
statistical method will some day be replaced by exact mathematical formulas
like those of physics? Explain.

8. Comment briefly on the following published statement:

“Jobless Survey to Bare Truths, C. C. Head Says. Pres. George Davis
of the Chamber of Commerce of the United States said Saturday an
impartial survey of the employable jobless would show their numbers
had been exaggerated and disprove alleged needs for spreading work
by reducing working hours.

“He said the chamber recently employed a statistical agency to
make a sample survey of 100 relief recipients in a representative city
of more than 100,000 population. The names of 50 men and 50 women
were picked at random from Federal and local governmental relief rolls
in the city.

“The survey showed, he said, that 44 out of the 100 never had been
employed in private business. Seventeen were over 70 years and 82
never had a bank or savings account.

“He says the figures point out that the greater number of those
labeled as unemployed could not or would not work in private industry
even if jobs were available.”

9. Give an example of representative and unrepresentative, adequate
and inadequate sampling that might occur, or has occurred, in social
research.

CHAPTER II


THE QUANTIFICATION OF SOCIAL DATA 

1. Definition and Counting. — The methods of statistics are
applicable only to data that can be expressed in some kind of
countable units. Any event or quality that can be recognized
can be counted. If we know a happy marriage when we see one,
we can count the happy marriages in a sample of marriages.
Nothing simpler can be done to a concept than to count how
many times instances of it occur. If the concept is not
sufficiently recognizable for its instances to be counted, one may
fairly assume that it is not yet ready for any kind of scientific
manipulation, except attempts to arrive at a more reliable
definition.

2. Classification. — If a concept (e.g., “conflict behavior”) can
be broken down into two or more subcategories (e.g., “war,”
“revolution,” etc.) that can be defined well enough to be told
apart, its cases can be classified. Classification makes possible
the counting of instances in each class, which may then serve as a
basis for considerable statistical analysis. We have simple
classification whenever data are sorted into categories that are
entirely unordered with respect to amount. For example, we
may classify our acquaintances as religious and nonreligious;
we may classify Americans as native white of native parentage,
native white of foreign parentage, foreign born, and so on.
Data may also be classified with respect to two or more criteria
at a time, as married couples by occupation of husband, by
income, and by number of children. The points to watch in
classification are careful, objective definition of the several
categories in terms of criteria that can be recognized in the
instances to be classified, and independent reclassification of
the instances by other competent investigators, to determine
the reliability of the classification. Logically, any classification
should be based on the same criterion throughout. Thus,
it would not do to classify some of the foreign born as Catholics
or Protestants, and the rest as Italians, Jews, Germans, and so on.
Also, any classification should be totally inclusive of the class
defined and exclusive of all other classes.¹ That is, if we are
dealing with all the foreign born in the United States, the
Rumanians should not be omitted, nor should the American
Indians be included.
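
Counting by classes, and by two criteria at a time, can be sketched in Python as follows. The records and the three-class nativity scheme are hypothetical, invented for the example. Each record carries exactly one value for each criterion, so the classes cannot overlap; the function `classify` (an illustrative name) additionally refuses any instance that the scheme fails to cover.

```python
from collections import Counter

# Hypothetical categories and records, invented for illustration.
NATIVITY = {
    "native white, native parentage",
    "native white, foreign parentage",
    "foreign born",
}

records = [
    {"nativity": "foreign born", "religious": True},
    {"nativity": "native white, native parentage", "religious": False},
    {"nativity": "foreign born", "religious": False},
    {"nativity": "native white, foreign parentage", "religious": True},
    {"nativity": "foreign born", "religious": True},
]


def classify(rows, key, categories):
    """Count rows per category; fail if any row falls outside the scheme."""
    unknown = [r[key] for r in rows if r[key] not in categories]
    if unknown:
        raise ValueError(f"not covered by the classification: {unknown}")
    counts = Counter(r[key] for r in rows)
    return {c: counts[c] for c in sorted(categories)}


# Simple classification on one criterion.
simple = classify(records, "nativity", NATIVITY)

# Classification by two criteria at a time: count each pair of categories.
cross = Counter((r["nativity"], r["religious"]) for r in records)

print(simple)
print(cross[("foreign born", True)])
```

Because every instance lands in one and only one class, the class counts add up to the number of instances classified.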

3. Measurement of Amount. — The fact that any quality, such
as happiness in marriage, varies in degree, sooner or later forces
the sociologist to go beyond the mere counting of instances, and
to attempt to measure the intensity of the quality in a given
instance or set of instances. For example, we may score the 
answers of married couples to a questionnaire and may regard 
the score of any couple as an index of the amount of happiness 
that they derive from their relationship. The central problem 
is, again, to find a unit in terms of which at least the relative 
amount of the quality can be measured. This is seldom easy to 
do, and must usually be approached through the devices of 
ranking, rating, or scoring. 

4. Ranking. — Ranking, or the arrangement of the instances
of a quality in order of amount, has been called the most
elementary form of measurement. We consider person A more
cooperative than person B, B more cooperative than C, and so on.
To increase the reliability of these judgments, the ranking may
be done independently by several qualified judges, and the
average ranks taken. Greater accuracy is sometimes obtained
by ranking each item with respect to every other item, i.e., by
all possible pairs. Where qualified and careful judges cannot be
obtained, ranking should not be used. As soon as the instances
of a quality are ranked, they become capable of a fair amount of
statistical treatment, including rank correlation.²
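
The averaging of independent rankings may be sketched in Python. The three judges, the four persons, and the ranks are hypothetical; `average_ranks` is an illustrative name, and rank 1 is taken here to mean "most cooperative."

```python
from statistics import mean

# Hypothetical rankings: 1 = most cooperative. Names and ranks are invented.
rankings = {
    "judge_1": {"A": 1, "B": 2, "C": 3, "D": 4},
    "judge_2": {"A": 2, "B": 1, "C": 3, "D": 4},
    "judge_3": {"A": 1, "B": 3, "C": 2, "D": 4},
}


def average_ranks(by_judge):
    """Average the rank each person received across all judges."""
    people = next(iter(by_judge.values())).keys()
    return {p: mean(ranks[p] for ranks in by_judge.values()) for p in people}


averages = average_ranks(rankings)

# The final order runs from the lowest average rank to the highest.
final_order = sorted(averages, key=averages.get)

print(averages)
print(final_order)
```

Where the judges disagree, as on A and B here, the average rank settles the order; where they agree, as on D, it simply repeats their common judgment.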

5. Rating. — Similar to ranking is rating, or the classification 
of items into ascending, or ordered, classes. There are usually 
three to seven of these classes. An odd number allows for a 
median class, which is desirable. Thus psychiatrists may rate 
persons in terms of their intelligence as Mentally Defective, 
Slow-dull, Slow, Average, Fairly Intelligent, Distinctly Capable, 

¹ See “classification” in any text in logic, e.g., E. A. Burtt, Principles and
Problems of Right Thinking, pp. 162-164, Harper & Brothers, New York,
1928.

² See Chap. X, Sec. 8.

and Very Able. Classification of instances into categories like
these should be done independently by two or more persons as a 
check. If there is good agreement in the placing of individual 
instances, the percentages of the instances put in each given 
category by several judges may then be averaged to improve the 
accuracy. Self-ratings may be used, as well as ratings by others. 

6. Scoring. — In the case of most score cards, the experimenter
decides impressionistically what subscore, usually a percentage,
should be given to each aspect of a variable (e.g., the
socioeconomic status of a home). In other cases, a subscore is
determined by counting the number of a certain item present
in each instance (e.g., books in a home), or by measurement in
the stricter sense (e.g., annual family income in dollars). The
total score is the sum of the subscores on the different items
included in the card. Usually the equality of the units, the
placing of the zero point, the weightings, and the meaning of the
total score are open to question; but in any case the total score
represents a series of accumulated judgments reduced to a
numerical common denominator. Scoring devices may be quite
elaborate, as may be seen by inspecting Chapin's living room
“scale” for scoring the socioeconomic status of a home, the
Stanford-Binet intelligence test, or score cards for, say, dairy
cattle used in judging contests at livestock shows. To show
that they are parts of the same or associated things, the score
on each item included on a score card should as a rule be high
when the total score on the card is high, low when the latter is
low. The theory of the score card is that the total score is an
index or function of (varies with) the amount of the quality
it is attempting to measure. Part of a living room score card
designed by F. Stuart Chapin to measure the socioeconomic
status of American homes is reproduced below.

Chapin’s Scale for Rating Living Room Equipment¹

DIRECTIONS TO VISITOR

1. The following list of items is for the guidance of the recorder. Not
all of the features listed will be found in any one home. Entries on the
schedules should, however, follow the order and numbering indicated.
Weights appear after the names of the respective items. Disregard
these weights in recording. Only when the list is finally checked
should the individual items be multiplied by these weights and the
sum of the weighted score be computed, and then only after leaving the
home. All information is confidential.

2. Check or underline the articles or items present. If more than
one, write 2, 3, or 4, as the case may be.

3. Do not enter the score of any article or feature present.
Complete recording before attempting to enter scores.

4. In cases where the family has no real living room, but uses the
room at nights as a bedroom, or during the day as a kitchen or as a
dining room, or as both, in addition to use of room as the chief gathering
place of the family, please note this fact clearly and describe for what
purposes the room is used.

5. When possible, it is desirable to have a living room checked twice.
This may be done in either of two ways.

a. After an interval of two or three weeks the same visitor may
recheck the room. The first schedule should be marked I, the
second II.

b. After an interval or simultaneously, the room may be checked by
two different visitors. One schedule should be marked A, the
other B.

Scores of the same homes on two trials should be similar. If a group
of homes are scored twice there should be a high correlation between
the scores. Please report findings to F. Stuart Chapin, University of
Minnesota.

¹ F. Stuart Chapin, Scale for Rating Living Room Equipment, American
Journal of Sociology, Vol. 37, pp. 583, 584, 1932.


Schedule of Living Room Equipment

I. Fixed Features

1. Floor: Softwood 1, hardwood 2, composition 3, stone 4.
2. Floor covering: Composition 1, carpet 2, small rugs 3, large rug 4,
Oriental rug 6.
3. Wall covering: Paper 1, calcimine 2, plain paint 3, decorative paint 4,
wooden panels 5.
4. Woodwork: Painted 1, varnished 2, stained 3, oiled 4.
5. Door protection: Screen 1, storm door 1.
6. Windows: 1 each window.
7. Window protection*: Screen, blind, netting, storm sash, awning,
shutter, 1 each.
8. Window covering*: Shades 1, curtains 2, drapes 3.
9. Fireplace: Imitation 1, gas 2, wood 4, coal 4.
10. Fire utensils: Andirons, screen, poker, tongs, shovel, brush, hod,
basket, rack, 1 each.
11. Heat: Stove 1, hot air 2, steam 3, hot water 4.
12. Artificial light: Kerosene 1, gas 2, electric 3.
13. Artificial ventilators 1.
14. Clothes closets 1.

Total section I

II. Built-in Features

15. Book containers: Shelves 1, cases 2.
16. Beds: In a sideboard 1, in a ceiling 2, in a [word illegible] 3.
17. Desk 1.
18. Window seats 1.
19. Window boxes 1.

Total section II

III. Standard Furniture

20. Table: Sewing 1, writing 1, card 1, library, end, tea, 2 each.
21. Chair: Straight, rocker, arm-chair, high chair, 1 each.
22. Stool or bench: High stool, footstool, piano stool, piano bench,
1 each.
23. Couch: Cot 1, sanitary couch 2, chaise longue 3, daybed 4,
davenport 5, bed-davenport 6.
24. Desk: Business 1, personal-social 2.
25. Bookcases 1.
26. Wardrobe or movable cabinet 1.
27. Sewing cabinet 1.
28. Sewing machine: Hand power 1, foot power 2, electric 3.

Etc., etc.

* If checked out of season, ascertain if used in season and so record.
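
The scoring rule given in the directions (multiply each recorded item by its weight, then sum the weighted scores) can be sketched in Python. The weights below are copied from the schedule; the recorded home, the dictionary layout, and the function name `weighted_score` are hypothetical and serve only to illustrate the arithmetic.

```python
# Weights copied from the schedule above; the recorded home is hypothetical.
WEIGHTS = {
    "hardwood floor": 2,
    "large rug": 4,
    "window": 1,          # 1 for each window
    "electric light": 3,
    "window seat": 1,
}


def weighted_score(counts, weights):
    """Multiply each recorded count by its weight and sum the products."""
    missing = set(counts) - set(weights)
    if missing:
        raise KeyError(f"no weight listed for: {sorted(missing)}")
    return sum(n * weights[item] for item, n in counts.items())


# What the visitor recorded: how many of each item were present.
recorded = {"hardwood floor": 1, "large rug": 1, "window": 3, "electric light": 1}

print(weighted_score(recorded, WEIGHTS))
```

Keeping the record of items separate from the weights mirrors the direction that weights be disregarded during recording and applied only afterward.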


7. The Scale. — The ideal measuring device is the scale. By
a scale is meant a sequence of interchangeable external units
numbered from zero, such as a straightedge marked off into feet
and inches. In sociology and psychology most attempts to
develop scales have started from ranks or ratings. One of the
simplest devices is the so-called graphic rating scale. The
following is an example.

[Figure: a horizontal line marked 0, 25, 50, 75, and 100, with the labels
Completely submissive, Submissive, Average, Dominating, and Completely
dominating beneath the successive marks.]

Fig. 1. — A simple graphic rating scale.

Each judge rates each subject on a separate scale by making a
mark on the scale where he thinks the subject falls. The
distance of the mark from “Completely submissive,” taken as
zero, is then measured in units of the spatial scale. The final
rating of each subject is the average of the ratings given him
by the several judges, provided there is a tendency toward
agreement among them. The scale may become more objective,
however, if a subject is scored, say, “80 per cent dominating”
because he is observed to dominate (as tangibly defined) in
80 per cent of his contacts. This assumes that one contact is
equal to another for the purpose in hand; but weighting may be
applied if needed. Evidently, this kind of scale cannot claim
the precision of scales in the physical sciences; but it is capable
of very useful results.
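
Both ways of arriving at a rating on such a scale can be sketched in Python. The length of the line, the judges' marks, and the record of contacts are hypothetical, and the function names `graphic_rating` and `percent_dominating` are introduced only for the example.

```python
from statistics import mean


def graphic_rating(mark_mm: float, line_mm: float) -> float:
    """Convert a mark's distance from the zero end into 0-100 scale units."""
    if not 0 <= mark_mm <= line_mm:
        raise ValueError("mark must lie on the line")
    return 100 * mark_mm / line_mm


def percent_dominating(contacts) -> float:
    """Share of observed contacts in which the subject dominated, as a percent."""
    return 100 * sum(contacts) / len(contacts)


# Hypothetical marks by three judges on a 120 mm line (invented values).
marks_mm = [84.0, 90.0, 96.0]
final_rating = mean(graphic_rating(m, 120.0) for m in marks_mm)

# Hypothetical observation record: True where the subject dominated.
observed = [True, True, False, True, True]

print(final_rating)
print(percent_dominating(observed))
```

The second function treats every contact as equal to every other, which is the assumption noted in the text; weights could be attached to the contacts if that assumption is not acceptable.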

If the ordinal¹ numbers derived from ranking are subjected to
arithmetical treatment, such as addition or the calculation of
means, it is implicitly assumed that the ranked instances are
equally spaced on a linear scale. Thus, if we rank cities in
respect to the efficiency of their governments, beginning with
the least efficient, so that city C is 1, city A is 2, city B is 3,
etc., and if we then use these ordinals as cardinals in arithmetical
calculations, we imply that the government of city A is twice as
efficient as that of city C, that the government of city B is 1.5
times as efficient as that of city A, etc. This assumption is, of course,
inaccurate, but sometimes it is the best that can be done, or it is
good enough for a particular problem. The zero point on such a
scale is arbitrarily placed, usually coincident with or one unit
below the lowest rank.
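
What is implied by treating ranks as amounts can be made concrete with a few lines of Python. The ranks are those of the example above; the function names and the choice to shift the zero point by one unit are assumptions made for the illustration.

```python
# Ranks from the text: least efficient government first.
ranks = {"C": 1, "A": 2, "B": 3}


def implied_ratio(ranks, city, baseline):
    """Ratio implied when ordinal ranks are treated as cardinal amounts."""
    return ranks[city] / ranks[baseline]


def shift_zero(ranks, offset):
    """Move the arbitrary zero point by adding a constant to every rank."""
    return {city: r + offset for city, r in ranks.items()}


print(implied_ratio(ranks, "A", "C"))   # A compared with C
print(implied_ratio(ranks, "B", "A"))   # B compared with A

# Moving the zero point keeps the order but changes every implied ratio.
shifted = shift_zero(ranks, 1)
print(implied_ratio(shifted, "A", "C"))
```

The order of the cities survives any placement of the zero point, but statements such as "twice as efficient" do not, which is why the placement matters whenever ratios are drawn from ranks.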

The most elaborate effort to build an exact scale yet made 
in the social sciences is probably that of L. L. Thurstone in the
case of his scale for the measurement of an attitude, a sample of
which is reproduced below.² Generalizing on Thurstone’s
method, and introducing minor modifications, it runs about as 
follows. A considerable number of supposed indexes of the 
attribute to be measured are chosen. Let us say that the 
attribute is “radicalism”; then the indexes might include mem- 
bership in the Socialist party, admitted statements made against 

¹ A cardinal number tells how many or how much; an ordinal number
locates position in a series.

² L. L. Thurstone and E. J. Chave, The Measurement of Attitude,
University of Chicago Press, Chicago, 1929.

the existing social order, membership in a labor union, radical
papers and journals read, expressed Communistic sympathies,
signature on radical petitions, participation in strikes, jail
sentences for radical activity, the authorship of radical articles
and books, subscriptions to this or that radical doctrine, atheism,
unconventional sexual behavior, and so on. After these indexes
have been selected and defined as objectively as possible, they
are submitted to a number of qualified judges, who are asked to
rank them in the order of the degree of radicalism that each
seems to imply. Indexes that appear to indicate about the same
degree of radicalism are regarded as ties. The indexes are thus
collected into successive piles, which to the judges should seem
to be equally spaced apart in degree of radicalism. When the
judges have finished ranking the indexes, each index is assigned
the average rank given it by the several judges, except that any
index about which the judges differ too much is rejected entirely.

[Figure: a horizontal scale running from 0 to 100, with Index C, Index A,
and Index B marked at their scale values.]

Fig. 2. Diagram of a generalized Thurstone attitude scale.

Each index will then have an average rank or scale value, and
these values may if desired be converted to a percentage scale,
from the lowest value taken as zero to the highest value taken
as 100 (see Fig. 2). The scale is then ready to be applied to
other samples of instances (say persons), by simply checking
on a list of the indexes those that apply to a given individual,
adding the scale values of the indexes checked, and averaging
them. Each individual may thus be given a scale value that
is supposed to measure in a relative way the amount, say, of
“radicalism” that characterizes him.
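
The steps just described (average the judges' ranks, reject indexes on which the judges disagree, convert to a percentage scale, and score a person by averaging the values of the indexes checked) may be sketched as follows. This is only an illustration under stated assumptions: the judges' ranks are invented, and the text does not say how much disagreement is "too much," so the standard-deviation cutoff `MAX_SPREAD` is a hypothetical choice.

```python
from statistics import mean, pstdev

# Hypothetical ranks given by four judges to five indexes (1 = least radical).
judge_ranks = {
    "labor union member":   [1, 2, 1, 2],
    "reads radical papers": [3, 3, 2, 3],
    "signed petition":      [4, 1, 7, 2],  # the judges disagree widely here
    "took part in strike":  [5, 5, 6, 5],
    "jailed for activity":  [7, 7, 7, 6],
}
MAX_SPREAD = 1.5  # assumed cutoff for "differ too much" (standard deviation)


def build_scale(judge_ranks, max_spread):
    """Average the ranks, drop disputed indexes, rescale to 0-100."""
    kept = {
        name: mean(ranks)
        for name, ranks in judge_ranks.items()
        if pstdev(ranks) <= max_spread
    }
    lowest, highest = min(kept.values()), max(kept.values())
    return {
        name: 100 * (value - lowest) / (highest - lowest)
        for name, value in kept.items()
    }


def score_person(scale, checked):
    """Average the scale values of the indexes that apply to one person."""
    return mean(scale[name] for name in checked)


scale = build_scale(judge_ranks, MAX_SPREAD)
person_score = score_person(scale, ["labor union member", "jailed for activity"])
```

With these invented ranks the disputed index is dropped, the lowest remaining index falls at 0 and the highest at 100, and a person to whom only those two extreme indexes apply receives a score midway between them.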

Thurstone’s attitude scale has often given results that correlated
highly with those obtained by much simpler procedures,
such as graphic rating scales, and ratings¹ or rankings represented
by consecutive numbers. It has also been criticized on various
theoretical grounds.²

¹ For example, individuals are classified as Very Radical, Radical, Neutral,
Conservative, Very Conservative, and those in the Very Radical group are
given a score of one, those in the Radical group a score of two, etc.

² See R. K. Merton, Fact and Factitiousness in Ethnic Opinionnaires,
American Sociological Review, Vol. 5, pp. 13-28, 1940.
Sample of a Thurstone Attitude Scale¹

EXPERIMENTAL STUDY OF ATTITUDE TOWARD THE CHURCH

Check (✓) every statement below that expresses your sentiment toward
the church. Interpret the statements in accordance with your own
experience with churches.

                                                                    Scale
                                                                    value
 1. I think the teaching of the church is altogether too superficial
    to have much social significance . . . . . . . . . . . . . . .    8.3
 2. I feel the church services give me inspiration and help me to
    live up to my best during the following week . . . . . . . . .    1.7
 3. I think the church keeps business and politics up to a higher
    standard than they would otherwise tend to maintain  . . . . .    2.6
 4. I find the services of the church both restful and inspiring .    2.3
 5. When I go to church I enjoy a fine ritual service and good
    music  . . . . . . . . . . . . . . . . . . . . . . . . . . . .    4.0
 6. I believe in what the church teaches but with material
    reservation  . . . . . . . . . . . . . . . . . . . . . . . . .    4.5
 7. I do not receive any benefit from attending church services
    but I think it helps some people . . . . . . . . . . . . . . .    5.7
 8. I believe in religion but I seldom go to church  . . . . . . .    5.4
 9. I am careless about religion and church relationships but I
    would not like to see my attitude become general . . . . . . .    4.7
10. I regard the church as a static, crystallized institution and as
    such it is unwholesome and detrimental to society and the
    individual . . . . . . . . . . . . . . . . . . . . . . . . . .   10.5
11. I believe church membership is almost essential to living life
    at its best  . . . . . . . . . . . . . . . . . . . . . . . . .    1.5
12. I do not understand the dogmas or creeds of the church but I
    find that the church helps me to be more honest and
    creditable . . . . . . . . . . . . . . . . . . . . . . . . . .    3.1
13. The paternal and benevolent attitude of the church is quite
    distasteful to me  . . . . . . . . . . . . . . . . . . . . . .    8.2
14. I feel that church attendance is a fair index of the nation’s
    morality . . . . . . . . . . . . . . . . . . . . . . . . . . .    2.6
15. Sometimes I feel that the church and religion are necessary
    and sometimes I doubt it . . . . . . . . . . . . . . . . . . .    5.6
16. I believe the church is fundamentally sound but some of its
    adherents have given it a bad name . . . . . . . . . . . . . .    3.9
17. I think the church is a parasite on society  . . . . . . . . .   11.0

¹ L. L. Thurstone and E. J. Chave, The Measurement of Attitude, p. 61,
University of Chicago Press, Chicago, 1929.
There are also several important methods of converting ranks
to a scale having more or less equal units and an arbitrary zero
point that are outside the scope of this text.¹ Probably the
most scientific is the mathematical method of curve fitting.²

When the concepts of space, time, money, weight, mass, and 
so on, are used in sociology, they are of course amenable to 
accurate measurement by scales already scientifically established. 

8. Discrete Aggregates. — Population aggregates are of great 
importance in sociological studies. It is possible to define these 
aggregates (communities, neighborhoods, families, and the like)
so that their number can be counted. We hold that it is also 
possible to measure the size of such aggregates by counting the 
number of individuals that compose them. We do this in the 
belief that the only essentials of measurement are units that 
are equal and interchangeable for a purpose. The sociologist 
finds it more useful for his purposes to measure the size of the 
family in terms of the number of its members than in terms of 
their weight in pounds or their height in inches. The nature 
of a “member” does not vary from person to person in any way
that interferes with the purpose. Moreover, since there is no
point in subdividing a “member,” nothing is lost because it is
logically a discrete unit. This idea of measurement can also be
extended to any other sociological concept that can be broken 
down into parts that are equal and interchangeable for the 
purpose in hand. 

9. The Measurement of an Intangible Quality. — All attempts 
to measure an intangible quality, such as an attitude, must, of 
course, be indirect in type. The classic example of indirect 
measurement in the physical sciences is a thermometer that uses 
the changing length of a column of mercury as an index of change 
in the amount of the intangible quality “temperature.” In
the case of the indirect measurement of a quality Y (temperature) 
in terms of an index X (mercury column), there should ideally 

¹ See J. P. Guilford, Psychometric Methods, McGraw-Hill Book Company,
Inc., New York, 1936; P. M. Symonds, Diagnosing Personality and Conduct,
pp. 86-89, D. Appleton-Century Company, Inc., New York, 1931.

² Karl J. Holzinger, Statistical Methods for Students in Education, pp.
221-224, Ginn and Company, Boston, 1928; C. H. Richardson, An Introduction
to Statistical Analysis, Chaps. VIII and X, Harcourt, Brace and
Company, Inc., New York, 1934.
be a perfect straight-line relationship between the two (see Chap.
X), so that each unit change in X represents a constant amount
of change in Y. But since Y is an intangible and cannot be
directly measured, there is no way of proving that such a relationship
exists between Y and X. So, while we may be certain that
a scale distance of say 4X is twice as great as a scale distance of
2X, we cannot be certain that a scale distance of 4X represents
twice as much of Y as does a scale distance of 2X. A child
with an I.Q. of 120 is probably not just twice as intelligent as
another child with an I.Q. of 60. All devices of indirect measurement,
including the thermometer, are open to this objection.
But the scientific and practical usefulness of the thermometer
and of other indirect measuring devices suggests that for many
purposes this is not serious. Usually the important things are
rather that the same absolute reading on the X scale shall always
represent the same amount of the intangible quality Y, as
verified by introspection or by some external result in which we
are interested (e.g., at 32°F., water freezes); and that the X scale
shall be able to differentiate changes in Y small enough for our
purposes. We shall then know what to expect from Y when
the scale registers a certain value of X. If the relationship is
close enough to permit a useful prediction of Y from the reading
on the X scale, the latter may still be valuable and is not to be
discarded until a better index is found.

In practical scale or score-card making, where there is an
attempt to measure an intangible quality Y in terms of a tangible
index X, it is often helpful to set up a “fundamental interval” for
subdivision. This is done by selecting two extreme observable
instances of Y, marking the values of X corresponding to them
“0” and, say, “100” respectively, and dividing the included
range of X into 100 equal units. In the case of one thermometer,
the extreme instances of temperature are taken at the melting
point of ice and at the condensing point of steam. As a parallel,
in mental testing, for certain purposes we might regard inability
to pass the first grade in school as indicative of zero intelligence,
and ability to finish the university with honors as indicative of
100 per cent intelligence, and represent intermediate degrees of
intelligence by scores between 0 and 100. As with most thermometers,
for many purposes the zero point need not denote an
absolute zero, and the upper limit need not mean the ultimate
maximum amount of Y. It is important, however, to make sure
that the “fundamental interval” includes as large a range of
data as investigators will require.
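
The setting up of a fundamental interval amounts to a simple linear conversion, which the following Python sketch illustrates. The column lengths in millimetres are hypothetical readings, and the function names are assumptions made for the example.

```python
def make_scale(x_low, x_high, units=100):
    """Build a converter from index readings X to a scale of 0..units.

    x_low and x_high are the readings of X observed at the two extreme
    instances chosen to mark the ends of the fundamental interval.
    """
    if x_high == x_low:
        raise ValueError("the two anchor readings must differ")

    def to_scale(x):
        return units * (x - x_low) / (x_high - x_low)

    return to_scale


# Hypothetical mercury-column lengths (mm) at melting ice and condensing steam.
to_degrees = make_scale(x_low=40.0, x_high=240.0)

at_ice = to_degrees(40.0)      # lower end of the interval
at_steam = to_degrees(240.0)   # upper end of the interval
midway = to_degrees(140.0)     # halfway along the column
below_zero = to_degrees(20.0)  # a reading outside the interval
```

A reading shorter than the lower anchor comes out negative, which is the numerical counterpart of the remark that the zero point of such a scale need not be an absolute zero.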

When the quality Y is subjective (e.g., happiness in marriage),
it has already been implied that there are two ways of testing the
amount of relationship between it and the tangible index X
(e.g., a score on the Burgess-Cottrell¹ scale for measuring happiness
in marriage): (1) by comparing the amount of Y indicated
by the X instrument with the subjective judgment of the subject
or of a competent observer (e.g., couples getting high scores
on the Burgess-Cottrell scale consider themselves happy), which
is appropriate if interest centers in the subjective quality as such;
and (2) by checking the readings of the X instrument against
certain tangible conditions that are ascribed to Y (e.g., low
happiness scores on the Burgess-Cottrell scale are followed by
divorce more often than are high scores). These are called
tests of validity. Validity is also established in part by definition
and agreement, e.g., the cooperative definition described in
Chap. IV. Chapin’s living room scale, mentioned above, is
intended to measure the socioeconomic status of the homes to
which it is applied. The fact that the card has given higher
scores when applied to upper middle class homes than when
applied to middle class homes, determined independently, is
evidence of its validity. Its reliability was established when
different observers used it on the same homes with little variation
in results.

Evidently, the indirect measurement of a subjective quality 
must wait upon the discovery of a satisfactory tangible index, 
which is to be sought among the apparent results or causes of the
subjective quality, among the results of common causes, or,
from a different point of view, among the external aspects of 
the subjective concept. Thus, the expansion and contraction 
of the column of mercury in a thermometer are apparently the 
result of changes in temperature. 

Whether the measurement of an intangible quality by means of 
a tangible index or by means of introspective ratings converted to 
scale values is superior depends upon particular circumstances, 
and especially upon the direction of interest. If possible, both 
should be carried through for purposes of validation. 

¹ E. W. Burgess and Leonard J. Cottrell, Predicting Success or Failure
in Marriage, Prentice-Hall, Inc., New York, 1939.
10. Rules of Measurement. — We summarize below what
are probably the most useful rules of measurement in social
research.

1. The quality that it is desired to measure should be defined
verbally as clearly as possible in the beginning. But the measurement
of a quality is also a crucial part of its definition. In
fact, “what the scale measures” may later be regarded as preferable
to the verbal definition, as equivalent to it, or as not at
all equivalent to it, depending on the degree of validity established
for the scale and the usefulness of its results.

2. The purpose of the measurement should be stated or 
understood. 

3. The unit used should be appropriate to the purpose of the 
measurement. 

4. Units should be equivalent one to another (equal, interchangeable)
for the purpose in view; except that in the indirect
measurement of an intangible quality in terms of a tangible
index the equality of the intangible units is indeterminate, and
for many purposes is unimportant.

If the units of a scale are sufficiently equal for a purpose, it is
safe for that purpose to add or average them, to interchange
them, or to claim that, say, two units represent twice as much
of the quality as does one unit.

For the historian, one year is not equivalent to another; for
the actuary constructing a life table, it is.

5. The unit should be applied as exclusively as possible to
the quality defined for measurement, in accordance with the 
purpose stated. 

That is, in measuring a man’s height in inches, we should not 
include his shoes, nor should we measure him in a slouched
posture. So, in measuring “intelligence,” we should, if possible, 
exclude inequalities of effort. 

6. The unit should be applied to the entire range in which
the investigator is interested. 

In applying an inch end-over-end, or an inch scale, to measure 
the height of a man, no part of the total distance that is his 
height should be skipped or measured in other than a single 
straight line. When a Fahrenheit thermometer registers the
temperature, however, it reads above or below a fixed point 
that is arbitrarily called zero. This is adequate for ordinary 
purposes, because most of us are interested only in the range of
temperature included in the thermometer, and not in an extension
of that range to a depth never observed in ordinary experience. 
But for some scientific work, the temperature needs to be 
measured from a true zero point, and a different scale is used. 

The ratio of two measurements holds only with reference to 
the zero point from which they are made. If this is not an 
absolute zero, that fact should not be forgotten when interpreting 
the ratios. 
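
The dependence of a ratio on its zero point can be checked numerically. The sketch below compares two Fahrenheit readings, first from the arbitrary Fahrenheit zero and then from absolute zero, which lies 459.67 Fahrenheit degrees below 0°F. The function name and the two readings are chosen only for illustration.

```python
ABSOLUTE_ZERO_F = -459.67  # absolute zero expressed in degrees Fahrenheit


def ratio_from_zero(a, b, zero=0.0):
    """Ratio of two readings, each measured as a distance from `zero`."""
    return (a - zero) / (b - zero)


# 64°F against 32°F, reckoned from the arbitrary Fahrenheit zero.
ratio_arbitrary = ratio_from_zero(64.0, 32.0)

# The same two temperatures reckoned from absolute zero.
ratio_absolute = ratio_from_zero(64.0, 32.0, zero=ABSOLUTE_ZERO_F)
```

From the arbitrary zero the first reading appears to be exactly twice the second; from absolute zero it is only about 6.5 per cent greater.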

7. The size of the unit should be fine enough to detect the 
smallest differences that are of importance for the inquiry, but 
need be no finer. 

8. Final judgment of an instrument designed to measure an
intangible quality should depend chiefly on tests of its validity 
and reliability. 

Summary. — We have seen that even “subjective” qualities
are amenable to a great deal of statistical analysis through
counting, classification, ranking, and rating. They cannot be
exactly measured unless the form of their theoretical distribution
is known a priori, or unless they are perfectly correlated with
some objective index, and it is seldom or never possible to
demonstrate completely either of these propositions. Neverthe- 
less, such qualities have already been measured in both the 
natural and the social sciences successfully enough to satisfy 
many important scientific and practical uses. Devices like 
the Binet test and like those used to score social attitudes, 
socioeconomic status, personality traits, and so on, are promising
approaches to measurement in social research, and their rapid 
improvement and extension to cover many more sociological 
concepts are to be anticipated. Moreover, objective qualities 
in which sociology is interested not only can be counted, classified, 
and the like, but they can also either be measured by the scales 
already standardized by the physical sciences or they should 
offer no difficulties that are peculiar to the social sciences.

Exercises 

1. Is anything more than clearness of definition necessary to render
data amenable to statistical treatment? Illustrate.

2. Can classification and counting alone form any basis for statistical
analysis? Illustrate.

3. What are the main points to watch in the use of classification?
Illustrate.

4. How does classification differ from rating? Illustrate.

5. Give an example of the kind and amount of ability a judge should
have to qualify as a “rater.”

6. Name at least one method of converting ranks to scale values.

7. Devise a simple graphic rating scale for the personality trait of
“sociability.”

8. Describe some scoring device used in sociology. What is your
opinion of it as a measuring instrument?

9. Distinguish between a scoring device and a scale in the strict
mathematical sense.

10. Distinguish between counting and measurement.

11. Illustrate a sociological problem where counting is equivalent
to measurement.

12. Discuss the possibility and necessity of equal units in the measurement
of an intangible quality.

13. What is of chief importance in the indirect measurement of an
intangible quality?

14. What is meant by the validity of a measuring scale? By its
reliability? How can an instrument designed to measure an intangible
quality be validated? Illustrate.

15. Give an example of an intangible quality of interest to sociology,
and describe briefly two ways in which it may be measured.

16. What method of measurement would you apply to answer each
of the following questions:

a. Does divorce tend to increase with family income?

b. Do the ablest people leave the farm for the city?

c. How do 10 cities compare in respect to good government?

17. What is the reason for taking a number of measurements of the
same thing and averaging them?
CHAPTER III
FACTOR CONTROL 

Among the social sciences the controlled experiment has been 
employed much less than in the natural sciences. As a rule, 
sociologists have either preferred or felt obliged to investigate 
social situations in all their original complexity and confusion. 
The methods for dealing with this kind of data attempt to
introduce control by means of classification in the case of attributes
(unmeasured traits, e.g., married, single) and by mathematical
devices in the case of variables (measured traits, e.g.,
age in years).

1. The Actuarial Method. — One of the most effective schemes 
of classifying attributes is similar in general principle to that
employed by actuaries in determining insurance risks.¹ For
example, a large number of paroled criminals may be sorted into
relatively homogeneous groups with respect to various criteria,
such as number of previous arrests, prison record, age, type of
offense committed, intelligence, and so on, and the rate of
violation of parole determined for each group. After proper
testing, these rates may then be used as estimates of the probability
of violation of other prisoners who fall in the established
classifications.

We begin with a specified group of items, say paroled prisoners
from the Joliet (Ill.) penitentiary on Jan. 1, 1941. The
simplest classification is a dichotomy, or separation of the A’s
from the not A’s. Thus, our parolees may be divided into the
married and the not married. If we wish to test whether
marital status (trait A) is associated with success on parole
(trait B), we compare the proportion of successful parolees
(B’s) among the married parolees (A’s) with the proportion
among the not married parolees (not A’s). When there is no
association, i.e., the traits A and B are independent, the two

¹ For a more thorough development of this technique, see G. U. Yule and
M. G. Kendall, An Introduction to the Theory of Statistics, Chaps. I-V,
Charles Griffin & Company, Ltd., London, 1937.
proportions will be the same, except for chance errors. In
other words, if 80 per cent of the married parolees succeeded, 
but only 60 per cent of the not married parolees did so, we would 
conclude that marital status was favorable to success on parole. 

Suppose we believe that a good prison record (trait C) also
makes for success on parole. We test it in the same way as we 
did marital status above, and confirm our belief. It may then 
be worth while to make a double classification of the parolees 
by marital status and by prison record, as shown in Table 1. 
From this table we note that the proportion of successful parolees 
in the group as a whole is 360/500 = 0.72, among the married is
240/300 = 0.80, and among the married with a good prison record is
65/70 = 0.93, approximately. On the other hand, among the


Table 1. Classification of 500 Parolees by Marital Status and
Prison Record, Joliet, Ill., Jan. 1, 1941. (Hypothetical Data)

                    Parolees, married     Parolees, not married
                    ------------------    ---------------------
Outcome             Record    Record      Record    Record        Total
                    good      not good    good      not good

Successful            65        175         25         95          360
Not successful         5         55         10         70          140
Total                 70        230         35        165          500

not married parolees with a not good prison record, the proportion 
of successes is 95/165 = 0.58 nearly. Evidently, in future groups
of parolees chosen in the same way and exposed to the same
general conditions as were the 500 represented in Table 1, a
married man with a good prison record may be expected to have
a much better chance of succeeding than a man not married with
a prison record that is not good. More specifically, for every
man of the first type that failed, we should expect 6 of the second 
type to fail, out of equal numbers placed on parole. 
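
The proportions quoted from Table 1 can be recomputed directly from its cell counts. The Python sketch below is one way of doing so; the dictionary layout and the function name are assumptions made for the example, while the counts are the hypothetical data of the table.

```python
# Cell counts from Table 1: (marital status, prison record) -> outcomes.
table = {
    ("married", "good"):         {"successful": 65,  "not_successful": 5},
    ("married", "not good"):     {"successful": 175, "not_successful": 55},
    ("not married", "good"):     {"successful": 25,  "not_successful": 10},
    ("not married", "not good"): {"successful": 95,  "not_successful": 70},
}


def success_rate(table, marital=None, record=None):
    """Proportion successful among the cells matching the given traits.

    A trait left as None is not used to restrict the classification.
    """
    cells = [
        counts
        for (m, r), counts in table.items()
        if (marital is None or m == marital) and (record is None or r == record)
    ]
    successes = sum(c["successful"] for c in cells)
    total = sum(c["successful"] + c["not_successful"] for c in cells)
    return successes / total


overall = success_rate(table)                          # 360 of 500
married = success_rate(table, marital="married")       # 240 of 300
best = success_rate(table, "married", "good")          # 65 of 70
worst = success_rate(table, "not married", "not good")  # 95 of 165

# Failures expected in the second type for every failure in the first type,
# out of equal numbers placed on parole.
failure_ratio = (1 - worst) / (1 - best)
```

The failure rates are 5/70 and 70/165, and their ratio comes to a little under 6, which agrees with the statement that about 6 men of the second type may be expected to fail for every one of the first.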

It is, of course, possible to subclassify the cases in Table 1
still further, either by substituting more complete breakdowns
for the dichotomies (e.g., married, single, divorced, widowed for
married, not married), or by introducing additional factors
(e.g., employment record before arrest).¹

¹ See Chap. XI.
2. The Search for Causes. — It is often said that the underlying
purpose of all science is prediction. Certainly, scientific
research constantly seeks to discover causes. Much philosophical
dispute has occurred regarding the nature and reality
of a cause, but we shall here say only that we mean by a cause
any factor whose change under controlled conditions is invariably
followed or accompanied by a change in a second factor. The
logicians refer to this as concomitant variation. The kind of
causes with which practical science is most concerned are simply
factors that give the easiest and most reliable prediction, or
understanding, of certain conditions that constitute a problem.
Thus, if we can always change the divorce rate in a given type of
social situation by changing the proportion of Protestant-Catholic
marriages, the intermarriage of Protestants and Catholics
may be regarded as one cause of divorce under the given
conditions. In the social sciences, there are always many
causes that combine to produce any actual situation or result.
Evidently the divorce rate of a city is the product of a vast
number of forces, only some of which can be discovered or
controlled.

3. Matching Experimental and Control Groups. — The logical 
requirements for establishing a causal relationship are the same
in every science.¹ It is always necessary to establish the fact
of concomitant variation. For working purposes, the procedure
is essentially to introduce, remove, or vary in amount the
suspected cause, and then to observe or measure the corresponding
changes, if any, in the thing that is expected to be affected.
For example, suppose that we want to test the belief that knowledge
of the evils of alcohol will prevent young people from
drinking. We expose a number of such persons to appropriate
instruction and note what proportion of them acquire the habit
of drinking within, say, a two-year period. In this group, called
the experimental group, the supposed cause is present. A second
group of young people, which may be termed the control group,
is given no instruction, so that the supposed cause is absent.
After two years, the proportion of habitual drinkers is determined
in this group also, and the proportions are compared
between the experimental and control groups. If the experimental
group shows a lower percentage of drinkers than the
control group, however, it still cannot be said that the instruction
made the difference, unless it can also be shown that nothing
else is likely to have done so. Thus, it is possible that the
experimental group contained a considerably larger proportion
of women or of church members than the control group, which
might make the comparison unfair. It is evidently necessary in
any experiment that the experimental and control groups shall
be essentially alike in all important respects that might affect
the outcome, except for the factor or factors under investigation.
This must, of course, be taken care of when the experiment or
investigation is being planned. The young people in our
experimental group must have no characteristics, except the
instruction, that will make them more liable or less liable to
become drinkers than those in the control group. The usual
way of trying to insure this equality is to match the two groups
in respect to every important point that may be related to
drinking, such as age, sex, family background, church membership,
present drinking habits and attitudes, and so on. Moreover,
all conditions must remain approximately the same for the two
groups during the two years that the experiment is under way.

¹ See John Dewey, Logic: The Theory of Inquiry, pp. 101, 462, 491, 509,
Henry Holt and Company, Inc., New York, 1938.
4. The Principle of Randomization. — In sociological research, 
however, it is seldom that an investigator can feel that his 
experimental and control groups are actually matched in all
important respects needed to insure a valid comparison between
them. He is, therefore, obliged to summon to his aid the principle
of randomization. Having matched his two groups as well as
he reasonably can, he then decides by a random draw which
of each pair of matched subjects, or which subjects from the
total lot, shall belong to the experimental group and which to
the control group. If this is not feasible, it may be decided by a
draw which of the two matched groups shall be the experimental
one, or this may be done in addition to the above. As long
as there are only two groups, this latter method of randomization
alone is not very effective. The experiment will be better
designed if there can be several groups, or replications, half of
which are drawn at random to serve as the experimental groups.
In some cases, indeed, the whole process of matching the groups
may best be omitted, and dependence placed in subdividing the
potential events (e.g., a large number of unselected young
people) into two or more groups by random selection. When
any good method of randomization is used, all initial differences
between the experimental and control groups should be accidents
of chance.¹
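
Two of the random draws described above, one within matched pairs and one from the total lot, may be sketched in Python as follows. The subject labels, the function names, and the seed are hypothetical; the sketch uses the standard `random` module in place of a physical draw.

```python
import random


def assign_matched_pairs(pairs, rng):
    """For each matched pair, decide by a random draw which member is
    experimental and which is control."""
    experimental, control = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        experimental.append(first)
        control.append(second)
    return experimental, control


def assign_from_lot(subjects, rng):
    """Divide an unmatched lot into two groups by random selection."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(1941)  # fixed seed only so that the draw can be repeated

pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]
experimental, control = assign_matched_pairs(pairs, rng)

lot = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
group_a, group_b = assign_from_lot(lot, rng)
```

In the first function every pair contributes exactly one subject to each group, so the matching is preserved while chance alone settles which member receives the instruction.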

5. Pretests and Final Tests. — Whatever method of equaliza- 
tion is used, it is well before subjecting the groups to the condi- 
tions of the experiment to test them to see how much alike the 
experimental and control groups really are in pertinent respects.
This is usually done by means of a pretest, which is the same 
as the final test that will be used at the end of the experiment to 
measure the differences between the groups at that time. Thus, 
in our illustration, we might set up a battery of questions about 
drinking habits that would enable us to decide to what extent 
a young person drank or was predisposed to drink, and if the 
experimental and control groups scored about the same on this
test, we might regard them as equivalent for the purposes of our 
investigation. 

6. The Influence of Additional Factors. — It is often desirable 
to test the effects of a third factor on the relationship between the 
independent and dependent factors in an experiment. In this
case, the third factor is inserted and removed, with only the
independent and dependent factors present. Thus, we might
observe the influence of sex in studying the influence of instruction
on drinking. Both control and experimental groups would
then be divided by sex, giving four groups rather than two. 

7. The Case of Continuous Variables. — In the illustration 
above, we were dealing with attributes, such as “instruction,”
“no instruction,” “habitual drinkers,” “not habitual drinkers,”
rather than with measured variables, like the amount of instruction
and the amount of the tendency to drink. Although there
is no difference in principle between the two cases, there is
some variation in procedure. Thus, if we wanted to measure
the amount of the tendency to drink in relation to the amount 
of instruction given, we should take several groups instead of 
only two. To each of the several groups we should give a 
different amount of instruction, including no instruction at all 

¹ For a more advanced discussion of this subject, together with the statistical
techniques of analysis of variance and covariance that have recently
been developed in connection with it, see E. F. Lindquist, Statistical Analysis
in Educational Research, Chaps. IV-VI, Houghton Mifflin Company,
Boston, 1940.
to one group, and note whether there was any relationship
between the increasing amount of instruction and the tendency 
to drink after two years. As before, we should have to equate 
the groups in all important respects before experimenting with 
them, or else be prepared to make corrections for the differences. 
Of course, we should have to devise scales for measuring the 
amount of instruction and the amount of the tendency to drink, 
before we could treat these factors as continuous variables. 
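
Once both factors have been reduced to scales, the relationship between them across the several groups can be summarized numerically. The passage itself does not prescribe a measure; the sketch below assumes the product-moment correlation coefficient as one possible summary, and the hours of instruction and tendency scores are invented for the purpose.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one entry per group, the first group receiving
# no instruction at all.
hours_of_instruction = [0, 5, 10, 15, 20]
mean_tendency_score = [62, 58, 55, 49, 46]  # higher = stronger tendency to drink

r = pearson_r(hours_of_instruction, mean_tendency_score)
```

With these invented figures the coefficient is strongly negative, that is, the groups given more instruction show the weaker tendency to drink. Whether such a relationship may be called causal still depends on the groups having been equated or randomized beforehand.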

8. Interfering Variables. — As in the case of attributes above,
it is usually important to measure the influence of certain interfering
variables. In our drinking experiment, some of these
might be the attitude of the parents toward drinking, the subjects’
ages, their money incomes, and so on. Such variables
are not matched or randomized out of the experiment, but are
introduced in varying known amounts, and their effects on the
independent and dependent variables are measured. Factors
may then be held constant, or their influence subtracted out, by
mathematical methods.¹ This type of analysis yields more
information and information of a more practical kind than when
all interfering factors are actually removed by matching or
are equalized by randomization; and it is also generally easier to
carry out.
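
One mathematical method of holding a factor constant, not named in the passage itself but treated in standard works on correlation, is the first-order partial correlation. The sketch below assumes that the three simple correlations are already known; their values and the variable labels are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of X and Y with the interfering variable Z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical simple correlations:
#   X = amount of instruction, Y = tendency to drink,
#   Z = parents' disapproval of drinking.
r_xy, r_xz, r_yz = -0.50, 0.40, -0.60

r_xy_given_z = partial_correlation(r_xy, r_xz, r_yz)
```

Here the relationship between instruction and drinking remains negative after the parents' attitude is allowed for, but it is weaker than the simple correlation, part of which was evidently due to the interfering variable.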

Exercises 

1. Illustrate the use of the actuarial technique in the prediction of 
success in marriage. 

2. Explain how you would obtain control over interfering factors in a 
study designed to show the effects of the presence of children on the 
divorce rate, or other problem of your choosing. 

3. Comment briefly on the following published statements: 

a. “Despite marked advances in appendicitis diagnosis and surgery,
Wisconsin’s death rate from the ailment, which stood at 11.6 deaths
per 1,000 population in 1911, nevertheless increased to a rate of 18.2
in 1930.”²

b. Women Are Safer Drivers than Men, Records Reveal. “When Mary
and Jack borrow Dad’s car for a ride, they’ll be smart if they let Mary
do the driving.

¹ See, for example, Mordecai Ezekiel, Methods of Correlation Analysis,
Chap. XIII, John Wiley & Sons, Inc., New York, 1930; or G. W. Snedecor,
Statistical Methods, rev. ed., Chaps. XII and XIII, Collegiate Press,
Inc., of Iowa State College, Ames, Iowa, 1938.

² Wisconsin State Board of Health Bulletin, Madison, April-June, 1935,
p. 26.
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“For in spite of the young man’s claim to being a better dnver, state 
highway commission records show that women drivers seldom are 
involved in fatal accidents Young men, however, are involved in 
more fatal automobile crashes than any other age class of motorists 

“Few women drivers are found on state highway commission fatahty 
records, and only one person was killed in the last two years by a girl 
driver under 18 years of age 

“State safety workers won’t argue that Mary is a better driver than 
Jack, but they do claim that state records indicate she is a safer driver ” 

c. Homemaking Careers Attracting More Girls. “In increasing number,
girls are turning attention these days to homemaking as a career.

“The popularity of homemaking courses is shown in the increasing
enrollment in home economics at the University of Wisconsin where
enrollment this fall is nearly 10 per cent above 1936, according to the
director of the course.”

d. “There has been more social progress in the United States in the
last 18 years since women have had the vote.”

e. “The Distilled Spirits Institute, demanding that the Anti-Saloon
League recognize the prevailing downward trend of major crimes, bases
its case largely on this general statement: The total (of all crimes) for
the calendar year 1936 showed a decrease of 112,055 offenses as
compared with 1935.”

(Turn in to the instructor two examples of the misuse of statistical
reasoning clipped from a newspaper or magazine.)

CHAPTER IV


THE STATISTICAL INQUIRY 

1. The Role of Nonquantitative Methods. — Access to
nonquantitative methods, such as the historical method, the case
study, and the general interview, is not to be denied the statistical
investigator in sociology. Many of his problems and ideas will
be suggested by working with materials of these kinds before
the statistical study is set up. Also, during the progress of the
collection of the statistical data and analysis of them, he will
usually find it invaluable to interview or talk with the informants
and their neighbors, to saturate himself with their points of
view and backgrounds, and to judge the reliability of their
replies to formal schedule questions by shrewd observation.
Finally, as suggested in Chap. I, in interpreting his statistical
findings, some important questions are almost certain to arise
that cannot be answered from the figures in hand, and he will
want to go back to the living situations for fresh suggestions.
The statistical investigator is expected, however, to limit his
formal conclusions to those arrived at by tested quantitative
methods.

2. The Problem. — The statistical problem in sociological
research may vary from what is exploratory and merely fact
finding to the testing of a sharply stated hypothesis, depending
upon how much is already known about the subject. We may
set up a study to find out anything we can about divorce in the
United States, or we may limit the inquiry to testing the
hypothesis that the occupation of the husband plays an important
part in the situation. Exploratory or fact-finding studies
should be regarded as merely preliminary to more specific and
better controlled studies, because the former cannot penetrate
beneath the surface of social phenomena. The problem should
also be cut to fit the limitations of time, money, and personnel
qualifications at the disposal of the investigator. It should
usually be a problem of obvious theoretical or practical
importance, although a certain amount of research without apparent
value but of interest to the investigator should be encouraged,
because this kind of probing about has sometimes resulted in
important scientific discoveries. The availability or lack of
availability of reliable statistical data is another consideration
that will affect the choice of a problem. This bears on the
point that the problem must be capable of quantification or
measurement. Above all, the problem should lie in the field of
methodological and informational competence of the investigator,
but as far as possible outside his field of personal bias. There is
sometimes a conflict here, as when a Negro sociologist wishes to
investigate the social conditions of the Negro race. He should
know the field better for being a Negro, but he is likely to carry
into the study a racial sympathy that may influence his findings.
It is very desirable for an investigator to state frankly his biases,
as well as to do his best to overcome them.

Of course, no problem should be finally selected until it is
known to what extent and by what methods it has already been
studied.¹ Although some investigations need to be repeated or
done differently for confirmation, it sometimes happens that
a problem has been very satisfactorily solved, and further work
on it would be a waste of time. What is more likely is that
certain angles of the problem have been worked out, but other
angles remain to be investigated. The research worker is,
therefore, guided by a knowledge of previous work into the
most profitable channels for further study, and may obtain
suggestions and warnings from what others have done.

¹ Aids in locating previous sociological research on a topic include the files
of The American Journal of Sociology, The American Sociological Review,
The Journal of Social Forces, Sociology and Social Research, and Population
Index; Social Science Abstracts (1929-1932); P. K. Whelpton, Needed
Population Research, Science Press Printing Company, Lancaster, Pennsylvania,
1938; The Psychological Index; Encyclopedia of the Social Sciences, E. R. A.
Seligman, ed., The Macmillan Company, New York, 1930; Poole’s Index to
Periodical Literature; Readers’ Guide to Periodical Literature; Annual Magazine
Subject Index; Book Review Digest; United States Catalog: Books in Print;
Cumulative Book Index.

In dealing with a statistical problem of the more scientific
sort, it is indispensable to state the problem as a formal
hypothesis or hypotheses to be tested. Such a hypothesis should be so
worded that the task of the investigator is made as easy as
possible. It is usually simpler to use a positive hypothesis than
a negative one, and then to try to disprove rather than to prove it.
Strictly speaking, we can never prove a general affirmative
proposition because we cannot examine all possible cases; but
a single exception may effectively disprove it. Thus we might
take as a hypothesis, “Any association found between the birth
rate and the business index, with the marriage rate held
constant, is due to chance errors,” and seek to show that in our
particular sample it is not due to chance errors. We can only
disprove, or fail to disprove, such a hypothesis. For practical
purposes, however, we may regard as provisionally true any
hypothesis that careful tests have failed to disprove.
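
The logic of trying to disprove a "due to chance" hypothesis can be sketched with a permutation test, a technique swapped in here for illustration and not one treated in this text. The sketch simplifies the hypothesis above by ignoring the marriage rate, and the ten yearly figures are invented; `chance_p_value` and its parameters are illustrative names.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def chance_p_value(x, y, trials=2000, seed=0):
    """Share of random re-pairings of y with x whose |r| is at least as
    large as the observed |r| (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (trials + 1)

# Hypothetical yearly figures (invented for illustration).
birth_rate = [18.9, 18.2, 17.5, 16.7, 16.6, 17.2, 16.9, 16.7, 17.1, 17.6]
business_index = [110, 95, 78, 62, 68, 75, 82, 96, 102, 88]

print("observed r:", round(pearson_r(birth_rate, business_index), 3))
print("p-value:   ", round(chance_p_value(birth_rate, business_index), 4))
```

A small p-value is evidence against the chance hypothesis; a large one means only that the test has failed to disprove it.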

3. Secondary Statistical Data. — Research is a cooperative
social enterprise, and the social investigator often necessarily
uses data collected by someone else. The chief sources of
secondary statistical data that are of interest to sociologists
are the publications of the various bureaus and divisions of the
Federal, state, county, and municipal governments, and a few
private agencies.¹

¹ Important Federal agencies in the United States include the Bureau of
the Census, the Division of Rural Life and Welfare of the Department of
Agriculture, the Bureau of Agricultural Economics, the Bureau of Labor
Statistics, the Children’s Bureau, Public Health Service, Works Projects
Administration, Division of Vital Statistics of the Bureau of the Census,
National Resources Committee, Interstate Commerce Commission, Central
Statistical Board, Department of Commerce, Department of the Interior,
Federal Bureau of Investigation, National Archives, National Youth
Administration, Tennessee Valley Authority, Women’s Bureau, United
States Employment Service, Immigration and Naturalization Service,
Agricultural Adjustment Administration, Farm Security Administration,
Office of Education in the Department of the Interior, Office of Indian
Affairs in the Department of the Interior. A current summary of Federal
agencies, their subdivisions and activities, is available in the United States
Government Manual issued by the National Emergency Council. A general
source for the purchase of Federal documents is the Superintendent of
Documents. All these agencies are located in Washington, D. C.

Information about births, marriages, divorces, deaths, and the public
health is published by state bureaus of public health or vital statistics, with
offices in the state capitals. State bureaus of correction, departments of
education, departments of agriculture, departments of public welfare,
planning boards, tax commissions, and the like are important sources of data for
students of social conditions. State and private universities and
agricultural colleges also gather and interpret a great deal of information. The
Brookings Institution of Washington, D. C., the National Bureau of
Economic Research of New York, the Russell Sage Foundation of New York,
the Scripps Foundation for Population Research of Oxford, Ohio, and the
Gini Foundation of Palo Alto, Calif., are private organizations whose work
is of value to social investigators.

The latest copies of the Statistical Abstract of the United States, published
by the United States Department of Commerce; the Abstract of the Census
of the United States, published by the United States Bureau of the Census;
and the World Almanac, obtainable at most newsstands, are of frequent use.
Bibliographies include those of Dorothy C. Culver, Methodology of Social
Research, A Bibliography, and of A. F. Kuhlman, Public Documents.

The League of Nations, the International Labor Office, and the
International Institute of Agriculture publish much statistical material of world
interest, available in public libraries.

Any serious statistical research project will, of course, soon lead
far beyond any general summary of sources of data. Much of
the success of the trained investigator depends upon his ingenuity
and persistence in discovering the available data that are
pertinent to his problem. Intimate familiarity with the field of
investigation is the best aid here.

After secondary data are found, however, the investigator
must examine them carefully and critically before he can safely
use them for his special purpose. He needs to know (1) the
definition of the thing that is enumerated in relation to his
purpose, or (2) the definition of the whole that is measured and
of the unit by which it is measured, (3) the exhaustiveness and
mutual exclusiveness of the classification, (4) changes in the
definition, (5) the extent of actual over- or underenumeration or
measurement, (6) the date or period in time to which the data
apply.

A few examples may be of help. In the 1935 Census of
Agriculture in the United States, a farm was carefully defined as

. . . all the land which is directly farmed by one person, either by his
own labor alone or with the assistance of members of his household, or
hired employees. A ranch, nursery, greenhouse, hatchery, feed lot,
or apiary is considered a farm. Establishments keeping furbearing
animals or game, fish hatcheries, stockyards, parks, etc., are not
considered as farms unless combined with farm operations.

The enumerator was instructed not to report as a farm any tract of
land of less than 3 acres, unless its agricultural products in 1934 were
valued at $250 or more.

A farm may consist of a single tract of land, or of a number of separate
tracts. These several tracts may be held under different tenures, as
when one tract is owned by the farmer and another tract is rented by
him. When a landowner has one or more tenants, croppers, or
managers, the land operated by each is considered a farm. Thus on a
plantation the land operated by each “cropper” or tenant was reported
as a separate farm. The land operated by the owner or manager, by
means of wage hands, was likewise reported as a separate farm.

That this definition of a “farm” nevertheless did not suit the
purposes of all users of the census appears from comments like
the following:

The census uses a concept of a “farm” which is an arbitrary
statistical definition violating any sound reasoning from whatever standpoint
we may choose. In counting farm operators the census makes no
distinction between the sharecropper on the one hand, and, on the other
hand, the farmer who operates his property either personally or with the
aid of a manager and the tenant who operates a farm — strange as it may
seem, in current American agricultural statistics the plantation does
not exist. Paradoxically enough, it lives statistically under the
disguise of its direct competitor and adversary, the small family farm . . .
nobody knows how many plantations existed in the United States in
1920, 1925, 1930, or 1935.¹

¹ Karl Brandt, Fallacious Census Terminology and Its Consequences in
Agriculture, Social Research, Vol. 5, pp. 19-37, 1938.

A great many more farms were enumerated by the Census of
Agriculture in 1935 than in 1930. Between these two censuses
no change was made in the definition of a farm; yet there is
evidence that the 1935 census counted as farms many plots
that were not counted as farms in 1930, especially in or near
mining and industrial areas. The depression and unemployment
caused the occupants of these plots to give more than ordinary
attention to gardening, chicken raising, and other home
production, and as a result these rural home places were lifted into the
farm class. Since the families and the plots were otherwise just
the same as they had been in 1930, and the “farmers” added by
the 1935 census were actually miners and industrial workers who
would return to their usual employment at the first opportunity,
it has been felt that the heavy increase in the number of farms
reported was largely spurious. As usual, however, the error,
if it may be so called, occurred on the periphery of the definition
where the concept defined (a farm) shades off into something
different (not a farm). Most of the farms added in the above
manner were quite small, and the value of their products was
so close to the minimum of $250 that they might easily slip
in and out of the farm category. The number of farmers
returned by the census of agriculture is never the same as the
number found by the accompanying census of occupations.
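
How plots near the minimum slip in and out of the farm category can be seen by writing the size rule as a test. This sketch covers only the 3-acre and $250 provisions quoted above, not the rest of the census definition; the function name and the four plots are hypothetical.

```python
def is_census_farm(acres, product_value_1934, min_acres=3.0, min_value=250.0):
    """Size rule only: a tract of less than 3 acres is reported as a farm
    only if its 1934 agricultural products were valued at $250 or more."""
    return acres >= min_acres or product_value_1934 >= min_value

# Hypothetical rural home places: (acres, value of products in dollars).
plots = [(1.5, 240.0), (2.0, 255.0), (0.5, 310.0), (4.0, 90.0)]

for threshold in (250.0, 300.0):
    count = sum(is_census_farm(a, v, min_value=threshold) for a, v in plots)
    print(f"minimum value ${threshold:.0f}: {count} of {len(plots)} plots are farms")
```

A modest change in the value of a garden's produce, or in the minimum itself, is enough to move a small plot across the line, which is the sense in which the error lies on the periphery of the definition.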

In the case of farm laborers, including members of the farmer’s
family working on the home farm, the problem of definition is so
difficult that not much reliance can be placed in the figures
furnished by the census. In addition, the census of 1920 was
taken as of Jan. 1 and that of 1930 as of Apr. 1, and this shift
of date alone caused a sharp variation in the number of farm
laborers reported. It is well known that the census of
population underenumerates young children, Negroes, and other
classes that for one reason or another are likely to be overlooked;
that the reporting of the population by years of age overloads
the 5’s and 10’s (e.g., 15, 20), at the expense of the other years
(e.g., 14, 17, 19, 22); and so on.
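
The overloading of ages ending in 5 and 0 can be measured. One standard demographic measure is Whipple's index, which is not taken from this text: it compares the number of persons aged 23 to 62 reported at ages ending in 0 or 5 with one-fifth of all persons in that age range. The sketch below uses invented single-year counts.

```python
def whipple_index(counts_by_age):
    """Whipple's index for ages 23-62: 100 means no preference for ages
    ending in 0 or 5; 500 means every reported age ends in 0 or 5."""
    ages = range(23, 63)
    total = sum(counts_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no persons aged 23-62 in the tabulation")
    heaped = sum(counts_by_age.get(age, 0) for age in ages if age % 5 == 0)
    return 100.0 * heaped / (total / 5.0)

# Hypothetical counts: 100 persons at every age, plus an extra 40
# reported at each age ending in 0 or 5.
reported = {age: 100 + (40 if age % 5 == 0 else 0) for age in range(15, 75)}
print(round(whipple_index(reported), 1))
```

An index well above 100 warns the user of the data that single years of age should be grouped before analysis.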

Such examples suggest only a few of the many pitfalls that
lie in secondary data, even when collected by a great national
agency like the Bureau of the Census, which may be regarded
as unbiased and thoroughly honest in those aspects of its work
that cannot be checked by the consumer of the data. The
dangers are usually much greater in the case of data supplied
by the smaller public agencies, like those of states or cities, and
by many private agencies. The best rule is to insist, as far as
possible, on knowing what was done by the collecting agency
at each step of the data-gathering process, from definitions to
field work to final tabulation; and on noting what checks they
have applied to test the accuracy, reliability, and validity of
their data. Only when the investigator is reasonably satisfied
after a painstaking scrutiny of this kind that the data are
appropriately defined and sufficiently accurate for his purpose is he
justified in going forward with the work of analyzing and
interpreting them. Research workers have wasted months of effort
and thousands of dollars before they discovered that the material
on which they were basing their conclusions was hopelessly
inaccurate to start with. Obviously, no amount of mathematical
treatment can make amends for data of this kind.





4. Primary Statistical Data. — The usual method of gathering
firsthand data in sociological research is by means of the schedule
or of the questionnaire. Both are sets of questions to be answered
in blank spaces provided. The questionnaire is mailed out to
informants and is not often to be recommended. Not only
are the persons addressed likely to misunderstand or interpret in
diverse ways the questions asked, but they seldom answer all
of the questions, and many of them make no returns at all,
thereby tending to produce a biased sample. A much sounder
plan is to have trained interviewers with a schedule visit the
persons who are to give the information,¹ or transfer the data to
the schedule from available records. The procedure properly
begins with the formulation of the problem, and ends with the
analysis of the data, because one step logically determines
another, and a given investigation should be developed as an
organic whole.

¹ It is possible to mail out questionnaires to carefully stratified classes of
the population, and to correct the replies in the light of answers obtained by
personal visitation of much smaller samples from each stratum.

5. The Schedule. — After the problem of fact finding or
hypothesis testing and the general approach to it have been
tentatively determined, the next step is normally to prepare
the schedule. The schedule is nothing more than a list of the
questions which it seems necessary to answer in order to test
the hypothesis or hypotheses, or to get the facts at which the
investigation is aimed. Much skill and labor are required to
include all the essential questions and nothing more. Anything
that is obvious or beside the point should be omitted. In
addition, each question must be simple and clear, and must be
answerable in terms of countable or counted units; and the
same question should have approximately the same meaning for
each informant. The units must be capable of objective
definition, so that there will be no serious amount of disagreement
about specific instances. Birth rates, an index of business
conditions, marriage rates, age in years or months, I.Q.’s, “male,”
“female,” “yes,” “no,” dollars, number of persons in family,
occupation, and so on, are acceptable units when carefully
defined in context. So much difficulty has been experienced with
a term like “occupation,” however, that the census bureau has
prepared a large manual with a detailed list of almost every
conceivable occupation, showing its schematic relationship to
more inclusive occupational categories.


THE ENUMERATIVE CHECK SCHEDULE

SPECIMEN FORM EC-1 AND INSTRUCTIONS

Printed below and slightly reduced from its actual size, is a specimen copy of EC-1 with entries made to illustrate
typical situations. For the persons enumerated, these include a fully employed head of family, a housewife, a
part-time worker, a worker temporarily absent, an unemployed worker, a new worker, a full-time student, a retired
invalid, and a worker on a special Government or emergency project. The specimen EC-1 Form as set out is followed
by a narrative describing the manner in which an enumerator might receive the answers which are recorded on it. The
instructions printed on the back of EC-1 are reproduced on the opposite page.

Thomas E. Brown, the enumerator, begins his work on Monday, November 29, 1937. He has been instructed by
the postmaster and furnished with a package of EC-1 Forms preaddressed for each dwelling on the route to which
he is assigned, as well as a supply of EC-2 notices. The first EC-1 Form bears the address 2102 North Lake Street.
It is not a farm, so he writes “No” in answer to “B.” Mrs. Johnson answers the bell, and when Mr. Brown has
introduced himself, explaining the purpose of his call, she gives him the following information about the members of her
household.

There are 11 in all, including 2 not yet 14 years old. Mr. Brown writes in “11” and “2” for “C” and “D,”
respectively, and proceeds to list the names of the 9 grown-ups, and then to fill in the answers for each as Mrs. Johnson
responds to the questions.

The head of the house is Philip Johnson, age 56. He was fully employed during the week of November 14-20 at
a regular job. She is his wife, has always kept house for the family and does not want work for pay. Their oldest
son, George, is 32. He has a regular job but was put on a part-time basis in September and worked only 16 hours during
the week of November 14-20. Yes, he wants more work. Helen, who is 28, has a job. She was out sick . . .
November 14-20, but has since returned to work. Arthur, age 24, worked for several years up to last summer when
he was laid off. He wanted a job during the week of November 14-20 and has been temporarily away from home in
another city trying to find one. Peter, age 20, has not worked before but he, too, is looking for a job. Mary, age 17,
is still in school. Paul Smith, age 80, is Mrs. Johnson’s father who lives with her. He gave up working several years
ago when his health made it impossible for him to carry on. Robert Jones, 24, is a roomer who is a laborer on a WPA
project.

Mrs. Johnson also says that another family, the Smiths, live upstairs in the house. As he does not have an addressed
EC-1 Form for the Smiths, Mr. Brown fills in a blank form for them and proceeds to his interview with Mrs. Smith.

A good rule is that each question in the schedule should be
answerable either in terms of some standard unit like dollars and
number of members in family (as defined), or in terms of a check
mark, code number, or letter that refers to a specific list. For
example, after the interviewer learns the subject’s occupation
he may enter in the schedule the code number of the appropriate
classification in the census manual of occupations. Open
questions, in answer to which any word or phrase may be inserted,
should be avoided. Thus the question, “To what social
organizations does he belong?” is usually less desirable than a
comprehensive list of social organizations to be checked, including
the catchall “Others,” to cover any institutions that may have
been omitted from the list. The ability of informants or
records to furnish sufficiently accurate answers should be
considered. Questions that call for more information than is
likely to be available, that rely too much on memory or on
memory of the distant past, that cause fatigue, or that excite
bias or involve personal interests, either are to be avoided or
special provisions are to be made to estimate, overcome, or
correct for the resulting errors. Questions addressed to an
informant should also be inspected to see if they suggest their
own answers (e.g., “Do you dislike to go to school?”). The
schedule should not modify the behavior it is intended to
measure. Special care should be taken that the schedule is not so
long as to weary or disgust the informants. If it has to be long,
more than one interview should be allowed, and the informant
should be paid or otherwise made to feel that the time given to
it is worth his while.
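
The advantage of a checklist with a catchall over an open question can be illustrated by coding answers against a fixed list. The list of organizations, the code numbers, and the function names below are hypothetical and serve only as a sketch.

```python
# Hypothetical checklist of social organizations and their code numbers.
ORGANIZATION_CODES = {
    "church": 1,
    "lodge": 2,
    "labor union": 3,
    "farm organization": 4,
}
OTHERS = 9  # catchall for anything omitted from the list

def code_membership(answer):
    """Return the list code for one reported organization."""
    return ORGANIZATION_CODES.get(answer.strip().lower(), OTHERS)

def tally(answers):
    """Count the answers falling under each code."""
    counts = {}
    for answer in answers:
        code = code_membership(answer)
        counts[code] = counts.get(code, 0) + 1
    return counts

print(tally(["Church", "lodge", "Chess club", " Labor Union ", "church"]))
```

Because every answer lands in exactly one listed class, the returns can be counted directly; a large "Others" count is a sign that the list needs extending.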

On page 38 is the enumerative check schedule used as a part of
the National Unemployment Census of 1937. Its purpose was not
to test any hypothesis, but merely to check the number of
unemployed persons enumerated by the voluntary registration plan.
It meets all the requirements mentioned above, except that it
employs a number of questions such as “Does he usually work for
pay?” the clarity and meaning of which are not obvious.

The Enumerative Check Schedule
INSTRUCTIONS
Household Information

A. Location — Give address fully, including apartment number, floor
number, rear, alley, etc., if necessary to identify the household.

B. Does this household live on a farm? — Consider as a farm any tract of
land locally so regarded.

C. Total number of persons in this household — Include all persons living in
the same household unit, including servants and lodgers, also children and
others temporarily away from this household.

D. Number less than 14 years of age — Enter total number of persons in
this household who are less than 14 years of age.

Questions About Each Person

Name. — Before making the entries in any other column, list the names of
all persons 14 years of age and over, then check with items “C” and “D,”
above, to account for every person in the household.

Write each name on a numbered line; never crowd additional names
between lines or at bottom of form. For households with more than ten
members 14 years of age and over, continue the listing on a second form,
repeating the address.

Column 1. Sex. — Enter “M” for male and “F” for female.

Column 2. Color or race. — Enter “W” for white, “Neg” for Negro, and
“O” for other. Enter persons of Mexican parentage as “white” (W).
The “other” (O) group includes Indians, Chinese, etc.

Column 3. Age at last birthday. — If the exact age is not known, enter the
approximate age.

Column 4. Was this person working for pay (or profit) during the week of
November 14 to 20, 1937? — Enter “Yes” for each person who worked for pay
(salary, wages, fees, commission, supplies, living quarters, etc.) or who
worked for profit (in his own business, store, or on his own farm) at any time
during the week of November 14-20. Enter “Yes” for each part-time
worker, even though he worked only a few hours each day, or only a few
days of that week.

Enter “No” for each person who was NOT working for pay or profit, as
defined above, at any time during that week. In addition to persons who
were totally unemployed, “No” should be entered for the following classes
of persons:

a. Housewives and other unpaid persons engaged only in housework or
helping without pay in a family business or store or on the family farm.

b. Sons, daughters, or other relatives who, without pay, help some
member of the household in his work for pay or profit.

c. Full-time students, and retired or disabled persons.

d. Persons who had jobs but who were temporarily absent from work
during the entire week because of sickness, strike, vacation, or other similar
reasons.
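
The Column 4 instruction reduces many kinds of persons to one test: was any time worked for pay or profit during the week? A minimal sketch of that rule follows; the function name and the example cases are hypothetical, and only paid or profit-making hours are counted.

```python
def column_4_entry(hours_for_pay_or_profit):
    """Column 4: 'Yes' if the person worked for pay or profit at any time
    during the week of November 14-20, otherwise 'No'."""
    if hours_for_pay_or_profit < 0:
        raise ValueError("hours cannot be negative")
    return "Yes" if hours_for_pay_or_profit > 0 else "No"

# Hypothetical cases, each with its paid or profit-making hours for the week.
cases = {
    "part-time worker, 16 paid hours": 16,
    "housewife, housework only": 0,
    "unpaid helper in family store": 0,
    "full-time student": 0,
    "job holder out sick the whole week": 0,
}
for description, hours in cases.items():
    print(f"{description}: {column_4_entry(hours)}")
```

Classes a to d in the instruction all receive "No" for the same reason: however busy the person was, no hours were worked for pay or profit in that week.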


6. The Instructions. — To deal adequately with the definition
of the terms used in a schedule, it is customary to accompany
the schedule with a set of instructions, like those that follow
the check schedule of the National Unemployment Census on
page 39. A reading of these instructions will give an idea of
the extent to which they may improve the accuracy of the
returns. In work of this kind there is, of course, always a
practical limit beyond which the matter of definition cannot be
carried.

7. The Tables. — It is usually impossible to set up a schedule
with much confidence unless tables to receive the returns are
made up at the same time. Just what summary statistics are
wanted should be listed (e.g., means, proportions, correlation
coefficients), and the tables needed to compute and exhibit
them drawn up, together with a transcription sheet or cards to
which all of the data will be transferred from the schedules.

Three of the many tables that were used in connection with the
enumerative check schedule of the National Unemployment
Census are shown below.

Table 2. — Persons Enumerated in Check Areas as Partly
Unemployed or as Part-time Workers, by Sex and Hours
Worked during the Week of Nov. 14-20, 1937*

(Data for persons 15-74 years of age)

                         Partly unemployed             Part-time workers
                        Total     Male   Female       Total     Male   Female

Total                  84,919   60,944   23,975      20,895   11,986    8,909

Reporting              82,898   59,438   23,460      12,388    6,538    5,850
  None                    105       72       33          23       14        9
  1-8 hours             8,268    4,848    3,420       1,193      434      759
  9-16 hours           20,449   13,899    6,550       2,636    1,211    1,425
  17-24 hours          30,195   22,137    8,058       3,747    1,982    1,765
  25-32 hours          18,120   14,028    4,092       3,099    1,808    1,291
  33-40 hours           4,896    3,813    1,083       1,303      849      454
  41 hours or more        865      641      224         387      240      147
Not reporting           2,021    1,506      515       8,507    5,448    3,059

Per cent reporting      100.0    100.0    100.0       100.0    100.0    100.0
  None                    0.1      0.1      0.1         0.2      0.2      0.2
  1-8 hours              10.0      8.2     14.6         9.6      6.6     13.0
  9-16 hours             24.7     23.4     27.9        21.3     18.5     24.4
  17-24 hours            36.4     37.2     34.3        30.2     30.3     30.2
  25-32 hours            21.9     23.6     17.4        25.0     27.7     22.1
  33-40 hours             5.9      6.4      4.6        10.5     13.0      7.8
  41 hours or more        1.0      1.1      1.0         3.1      3.7      2.5

Median                   19.3     19.9     17.7        21.0     22.5     19.3

* From Dedrick and Hansen, Final Report on Total and Partial Unemployment, 1937,
Vol. IV, p. 31, The Enumerative Check Census, Census of Partial Employment,
Unemployment, and Occupations, United States Government Printing Office, Washington, 1938.

Table 3. — Persons Enumerated in Check Areas as Not Available
for Employment, by Sex, Usual Work Status, Desire for
Work, and Ability to Work*

(Data for persons 15-74 years of age. Percentage not shown where less
than 0.1)

                                 Both sexes             Male                Female
                               Number Per cent of   Number Per cent of   Number Per cent of
                                       population           population           population

Total not available for
  employment                  608,460        41.5  102,991        14.2  505,469        68.3
Wanting but not actively
  seeking work                 21,108         1.4    9,222         1.3   11,886         1.6
    Usually work               14,082         1.0    7,491         1.0    6,591         0.9
    Do not usually work         7,026         0.5    1,731         0.2    5,295         0.7
Wanting but unable to work      3,471         0.2    2,264         0.3    1,207         0.2
    Usually work                2,668         0.2    1,868         0.3      800         0.1
    Do not usually work           803                  396                  407
Not wanting and do not
  usually work                583,881        39.8   91,505        12.6  492,376        66.5

* From Dedrick and Hansen, Final Report on Total and Partial Unemployment, 1937,
Vol. IV, p. 33, The Enumerative Check Census, Census of Partial Employment,
Unemployment, and Occupations, United States Government Printing Office, Washington, 1938.

As a result of constructing specific tables, the original schedule 
is likely to be considerably amended and improved, especially 
if a complete set of tables is made covering every important step 
in the treatment to which the data are to be subjected, including
all work tables for the statistical analysis. 

8. Testing the Schedule. — After a schedule has been tentatively
constructed, it should be tested for accuracy, reliability,
and, if necessary, for validity. This applies to each question
separately and to the schedule as a whole.

Accuracy may be checked by applying the schedule to known
data, and noting how closely the returns agree with the a priori
information. The interviewer employed should have no prior
knowledge of the data, and should not be aware that a check is
being made. It is also sometimes possible to include in the
schedule pairs of questions that get the same information in
independent ways; but this is usually confined to a few of the
most important but least reliable questions.

Table 4. — Gainful Workers, 1930, and Persons Employed or Avail-
able for Employment in Enumerative Check Areas, 1937, by Sex
and Race as Percentage of Population*

(Data for persons 15-74 years of age)

                                  Both sexes                     Male                       Female
                          All     White   Negro and     All     White   Negro and     All     White   Negro and
Year                      races           other         races           other         races           other
                                          races                         races                         races

Gainful workers, 1930†    57      56      66            87      87      91            25      23      41
Employed or available
  for employment, 1937    59      58      68            86      86      87            32      30      50

* From Dedrick and Hansen, Final Report on Total and Partial Unemployment, 1937,
Vol. IV, p. 35, The Enumerative Check Census, Census of Partial Employment, Unem-
ployment, and Occupations, United States Government Printing Office, Washington, 1938.
† Data derived from Fifteenth Census of the United States, Population, Vol. V, p. 117.

Reliability is measured by trying the schedule twice on essentially
the same data and comparing the results. It is often
impractical to apply the schedule more than once to the same
informant without introducing the memory factor or causing
an undesirable response. Probably the best that can then be
done is to apply the schedule to two random samples from the
same universe of informants, and compare the returns. The
same interviewer or interviewers should be used in each case.
In all such tests, the differences observed should fall well within
the range of random sampling error.¹
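
The comparison of two random samples can be sketched in a few lines. This is only an illustration under stated assumptions: the item is a yes-or-no question, the counts are hypothetical, and a difference smaller than about two standard errors is taken to lie "within the range of random sampling error"; the text itself fixes no particular threshold.

```python
import math

def difference_in_standard_errors(yes_a, n_a, yes_b, n_b):
    """Compare the proportion answering "yes" in two random samples.

    Returns (difference, standard error of the difference, their ratio).
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    # Pooled proportion: both samples are assumed to come from the same universe.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    diff = p_a - p_b
    return diff, se, diff / se

# Hypothetical returns: 120 of 400 informants answer "yes" in the first
# sample and 132 of 400 in the second.
diff, se, ratio = difference_in_standard_errors(120, 400, 132, 400)

# Assumed rule of thumb, not taken from the text: under two standard errors.
consistent = abs(ratio) < 2
print(f"difference = {diff:.3f}, standard error = {se:.3f}, consistent = {consistent}")
```

With these made-up counts the difference of three percentage points is less than one standard error, so the two trials of the schedule would not be judged inconsistent.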

A schedule, a part of the schedule, or one or more questions
in the schedule need to be tested for validity when it is not clear
that they measure what is intended to be measured. This is
invariably the case when broad concepts are involved. For
example, if a schedule is designed to discover the number of the
“unemployed” in the United States as of a certain date, it is
advisable to give careful consideration to the matter of validity.
Whenever a recognized and proved scale for the same purpose
already exists, all that is required is to find the amount of agreement
between the returns from the two instruments, as used
on the same data. As a rule, however, this convenient situation
does not occur: there is no true criterion by which to test the new
instrument.

¹ See Chap. XII.

In many cases the proper approach is simply that of finding
an acceptable definition. With the help of anticipated users of
the research, the investigator defines (1) what “area” of meaning
of a term (e.g., the “unemployed”) should ideally be measured,
(2) what parts of this area it is practicable to measure reliably
enough for the purposes of the inquiry,¹ and (3) what parts it is
not feasible to measure. The meaning that should ideally be
measured is the meaning that the users want measured. The
investigator then tries to find objective and reliable indexes,
which, by agreement, cover as much of the desired meaning as
possible. The remaining part that is not covered should then be
clearly recognized by the investigator and his public, and both
should regard the omission as not serious enough to invalidate
the study. Of course, the public may sometimes be the investigator's
scientific colleagues, sometimes social welfare agencies,
and sometimes the general public. Or the investigator may
merely interpret the interests of the public as he thinks best.

It will frequently happen that the persons representing the
consumers of the research will differ in what they want measured.
In such a case, the choices are (1) to try to include all the desired
parts of the meaning in a single index, (2) to use separate indexes
for different parts of the meaning, or (3) to omit some parts of the
meaning, and thereby reduce the number of people who will be
satisfied with the results.

One advantage of the method of setting up an inclusive or
ideal definition of the area of meaning to be measured and then
marking out how much of it the given instrument can reasonably
be expected to measure is the fact that it may be possible in
later studies gradually to expand the area measured until consumers
finally agree either that the result is a satisfactory index
of the total meaning, or that the part omitted is so intangible
and so little agreed upon that it can be disregarded. It is also
better to know approximately what the instrument does and
does not measure, i.e., how useful it is for its purpose, than to
say merely that “it measures what it measures!”

¹ The Census of Partial Employment, Unemployment, and Occupations, 1937,
whose schedule is shown above, included persons totally unemployed and
wanting work, emergency workers on WPA, NYA, CCC, etc., and persons
partly employed.

If the method of cooperative definition has not been used, or
if its results are not entirely satisfactory, the question still
remains whether the index measures what it is wanted to measure.
Even in the more objective and simple instances, this is not
always certain. Thus, if we are trying with Thorndike to
measure the desirability of cities as places of residence,¹ and
include urban death rates as an objective index covering one
aspect of the concept of desirability, we shall need to ask if the
rates have been standardized for differences in the age and sex
composition of the city populations, if the out-of-town deaths
occurring in local hospitals have been omitted, and so on, before
we can be sure that the rates reflect differences in the incidence
of fatal diseases and accidents between cities. In cases like this,
the validity may be taken as established when our several
questions are properly answered. But in dealing with less
objective traits, this may not be enough. Suppose we include
an attempt to measure the subjective trait of “friendliness” as a
further element in the desirability of cities as places of residence.
By the method of the cooperative definition outlined above,
we may arrive at a combination of the average number of social
visits and the percentage of the population belonging to social
organizations as a tangible index of this subjective quality. A
potential consumer of the investigation who has been consulted,
however, may say that he has lived in several cities and found
the people in some much “colder” to newcomers than in others,
and he doubts that the index will show this difference. If the
consumer from personal experience can classify certain cities as
“colder to newcomers” than others, we can apply our index of
friendliness and see where it places them. If the results are in
agreement with his observation, he is likely to accept the index.
Of course, in such cases, the experience or opinion of a single
individual is not enough. We should actually need to have many
persons, representative of our public, rate or score a group of
cities in regard to “coldness to strangers,” and compare their
ratings or scores with the results of our index of friendliness.
In doing this, we should be careful to choose as raters individuals
who are well acquainted through actual residence with at least
some of the cities in question.

¹ E. L. Thorndike, Your City, Harcourt, Brace and Company, Inc.,
New York, 1939.
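
One way to "compare" the raters' scores with the index is to correlate the two series city by city. The sketch below is an illustration only: the six cities, their index values, and their average "coldness" ratings are invented, and the product-moment correlation is offered as one possible measure of agreement, not as the procedure the text prescribes.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for six cities.
friendliness_index = [62, 48, 75, 30, 55, 41]        # our objective index
mean_coldness = [2.1, 3.0, 1.6, 4.2, 2.8, 3.5]       # average rating by residents

r = pearson_r(friendliness_index, mean_coldness)
print(f"r = {r:.2f}")
```

A strongly negative coefficient, as with these made-up figures, would mean that cities scoring high on the index are the ones the raters found least "cold," which is the kind of agreement the argument calls for.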

Moreover, the ratings when repeated by the same or like
groups should give essentially the same results. If the several
raters show little agreement among themselves, as may happen,
no criterion at all will result from this procedure. In that case,
we may need to face the problem of the average. Probably a
certain city was actually “cold” in its treatment of some of the
raters and not of others. We might then have to devise a
reliable score that would reflect the proportion of the raters who
regarded the city as “cold,” or the amount of “coldness” that
they experienced there on the average, and relate it to our index.
Or we might feel it advisable to stratify our raters by socioeconomic
classes (e.g., rich, average, poor), and get separate
ratings from each class. The latter plan would require us to
deal with the whole problem of the desirability of cities as places
of residence from the point of view of each social class separately,
which should provide a set of indexes of more value than any
single index representing a gross average for all classes. In
addition to subjective ratings, we might also set up, preferably
by agreement, certain objective criteria of friendly or unfriendly
cities, such as their methods of dealing with unfortunates, that
are not included in our index, and test the latter against them.

The final test of such an index, of course, is whether in practice 
it proves more useful than other methods in selecting cities that 
people will actually find desirable or undesirable places of 
residence, in accordance with the prediction of the score card. 

Ingenious ideas can often be used in testing the validity of
an index. For example, if we are measuring attitude toward
religion, we might see if our scale will place a group of ministers
at the favorable end, a group of atheists at the unfavorable end,
and average citizens for the most part in the middle. In work
of this sort there are, however, many pitfalls that can be learned
only from experience.

Such tests of accuracy, reliability, and validity as mentioned
above imply that the schedule will be tried out in the field on a
small scale and carefully revised in the light of the results before
it, the instructions, and the tables are put in final form. This
preliminary trial almost invariably leads to some important
changes, and should rarely be omitted from the routine of
statistical research.

9. The Interviewer. — After the schedule has been carefully
prepared and tested, the purpose of the interviewer or data taker
is merely to see that the questions are understood and answered
to the best of the ability of the informant, or that the right data
are accurately copied from the proper sources. The less the
interviewer says or does beyond this, the more dependable the
returns should be. He must be especially careful not to suggest
answers to the informant, or to bias him in any way. While
this may seem to be a negative role, it is one that calls for skill
and judgment. The ability to induce informants of various
kinds cheerfully to give accurate information, or to extract data
without error from complex or confused records, is not common.

It is often desirable to test the results obtained by each
interviewer by noting whether an interviewer's returns differ too
much from those of others reporting similar data. Also, when
the schedules are edited, certain kinds of errors made by the
interviewers may be noted. The interviewers may then be
cautioned, or their work may be corrected for the personal
equation.

In the gathering of information by schedule, several interviewers
or clerks may be supervised by a foreman, or the investigator
may do all this work himself. In any case, the investigator
should participate in the actual field or library work at least
enough to acquire a firsthand knowledge of the conditions under
which the data were obtained, and a “feeling” for the data, as it is
termed. Many an investigation has been saved or lost by the presence
or absence of the analyst during the data-collecting process.

10. Editing the Schedules. — The schedules filled out during
each day on a large study are generally sent in to a group of
editors at the headquarters of the study. Under the direction
of a chief, these clerical workers look for unfilled spaces, for
inconsistent answers, and the like, on each schedule. Where
necessary, a defective schedule is returned to the field or library
foreman, who in turn hands it to the interviewer or the clerk
whose initials appear on it. In small studies, the schedules
taken during the day are often edited by the interviewers themselves
each night.
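
The editor's search for unfilled spaces and inconsistent answers is easy to imitate mechanically. The following is a minimal sketch under stated assumptions: the field names, the sample schedules, and the single consistency rule are all hypothetical and stand in for whatever the actual schedule and instructions require.

```python
# Hypothetical list of items that every schedule must answer.
REQUIRED_FIELDS = ["name", "age", "work_status", "hours_worked", "interviewer"]

def edit_schedule(schedule):
    """Return a list of defects found on one schedule (empty if none)."""
    defects = []

    # Unfilled spaces: a required item that is missing or left blank.
    for field in REQUIRED_FIELDS:
        if schedule.get(field) in (None, ""):
            defects.append(f"unfilled: {field}")

    # Inconsistent answers: one illustrative rule only.
    hours = schedule.get("hours_worked")
    if (schedule.get("work_status") == "totally unemployed"
            and isinstance(hours, (int, float)) and hours > 0):
        defects.append("inconsistent: totally unemployed but hours worked > 0")

    return defects

good = {"name": "A", "age": 34, "work_status": "partly employed",
        "hours_worked": 20, "interviewer": "TCM"}
bad = {"name": "B", "age": "", "work_status": "totally unemployed",
       "hours_worked": 12, "interviewer": "TCM"}

for schedule in (good, bad):
    defects = edit_schedule(schedule)
    if defects:
        # The interviewer's initials say to whom the schedule is returned.
        print(schedule["interviewer"], defects)
```

The interviewer's initials are kept on each record so that a defective schedule can be sent back to the person who took it.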
11. Tabulation of the Data. — Edited schedules go to tabulators,
who tally the data from the schedules to the tables, or,
if machine methods are used, punch them on cards according to a
code arranged for the purpose. Machines are, of course, faster
and more economical for large-scale tabulation.

The chief electrical machines now in use are the card punch
and verifier, the sorting machine, and the tabulating machine.
A general idea of what each does may be obtained from the
following description:

The first step in the use of a card for a particular record is the designation
of groups of columns as fields. Each field defines a section of
the card in which one particular type of information will always appear.

The illustration following (Fig. 3) shows an 80-column card partly
drawn up into fields. Each field is assigned a sufficient number of
columns to include the largest number of digits which it will be called
upon to accommodate.

Fig. 3. — Eighty-column tabulating card.

For instance, the greatest number of months is 12 (a two-digit
number), therefore, two columns are sufficient for recording this information.
The greatest number of days in a month is thirty-one, thus this
field too requires only two columns. The year is indicated by the last
two digits, making two more columns necessary, etc.

Figure 4 illustrates (a 45-column) card completely laid out for a
specific job, in this instance a complex (criminological) study.

At this point it is obvious that all pertinent information must be
registered in the card in the form of punched holes. The perforation
of these holes is a simple matter. The digits of the numbers to be
transcribed correspond to the digits printed on the card. Thus, to
show the date Oct. 15, 1934, on the card illustrated above (Fig. 3),
the card is perforated as follows: 10-15-34. Descriptive information,
such as the names of persons or products, is generally coded numerically.
Tabulating cards are perforated by means of an electric punching
machine. The punch designed for the numerical system has a keyboard
consisting of twelve keys, one for each punching position of a column.
As a key is depressed a hole is cut and the card advanced automatically
to the next column to be punched. The automatic features of the
machine and the simplicity of the keyboard make the transcription of
written data into punched-hole form easy, rapid and efficient.

Fig. 4. — Forty-five column card with field headings.

When punching has been completed, the cards are usually in miscellaneous
order. The next step is to arrange them in sequence by some
desired classification — that is, to group them according to some information
which is punched in them. The Card-operated Sorting Machine
is used for this purpose.

The operation of the Electric Sorting Machine is based on the position
of the punched hole in a vertical column of the card. As the cards
pass through the machine a brush contact is made through the hole,
causing an electrical circuit to be closed. This momentary circuit causes
the card to be directed to a receiving pocket which corresponds to the
position of the punched hole. For example, a card punched “9” in
the column under consideration is directed to the 9 pocket, a card
punched “6” in the same column is directed to the 6 pocket, etc. . . .

The automatic sort is made on one column at a time. It is apparent,
therefore, that to arrange a group of cards in numerical sequence according
to the data punched in a three-column field, the group is passed
through the sorting machine three times. The sort is made first on the
units column, then on the tens column and finally on the hundreds
column. The Card-operated Sorting Machine is entirely automatic
and operates at a speed of 400 cards per minute.

The third step in the Punched Card method is the automatic compilation
of the data into printed reports. This is accomplished by the
Electric Tabulating Machine which is a combined adding, subtracting
and printing machine. Punched cards passing through this machine
actuate the various adding counters and printing mechanisms — again
by means of electrical contacts. . . . The machine is entirely automatic,
operates at a speed of 150 cards per minute. . . .¹
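
Two of the operations in the quoted description, laying a date out in fixed fields and sorting a multi-column field one column at a time, can be imitated in a short sketch. It is an illustration under simplifying assumptions: a card is modeled as a string of digits, only the ten digit pockets are represented (the card itself has twelve punching positions), and the sample cards are hypothetical.

```python
def punch_date(month, day, year):
    """Lay out a date in three two-column fields: month, day, last two digits of year."""
    return f"{month:02d}{day:02d}{year % 100:02d}"

def sort_on_column(cards, column):
    """One pass through the sorter: each card drops into the pocket
    matching the digit punched in the chosen column."""
    pockets = [[] for _ in range(10)]
    for card in cards:
        pockets[int(card[column])].append(card)
    # Gather the pockets in order 0 to 9; cards keep their order within a pocket.
    return [card for pocket in pockets for card in pocket]

def sort_on_field(cards, first_column, width):
    """Sort on a multi-column field: units column first, then tens, then hundreds."""
    for column in range(first_column + width - 1, first_column - 1, -1):
        cards = sort_on_column(cards, column)
    return cards

print(punch_date(10, 15, 1934))   # the date used in the quoted passage

# Hypothetical cards carrying a single three-column field.
cards = ["407", "123", "410", "032", "123", "999"]
sorted_cards = sort_on_field(cards, first_column=0, width=3)
print(sorted_cards)
```

Three passes suffice for a three-column field because each pass leaves undisturbed the order established by the earlier passes among cards that fall into the same pocket.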

12. Analysis of the Data. — The analysis of the data should
proceed along the lines laid down in planning the study, although
any minor modifications or extensions that later appear advisable
may be made. This means that the data have already been
put in work tables for computing means, percentages, standard
deviations, correlations, or whatever other statistics are needed
for simplifying and interpreting the findings. After these
statistics have been worked out and their accuracy has been
carefully checked, the investigator should state the results as
simply, briefly, and clearly as he can. Only a few of the most
vital tables should be presented with the text of the report, all
others that seem desirable being placed in an appendix. Where
graphic devices promise to be effective, they can be introduced.

Perhaps the most important things to keep in mind at this
crucial stage of an investigation are to limit the conclusions to
what the data show, while yet seeking to use enough imagination
and insight to discover all of the pertinent information that may
be extracted from the findings. There are, of course, no rules
by which this can be done. Everything depends upon the
ability, integrity, training, and persistence of the analyst.

13. The Amount of Error of Observation or Record in Statistical
Results.² — The readers of sociological studies are not
unreasonable when they express an attitude of skepticism
toward the elaborate precision of some of the statistical techniques
that are frequently applied to social data of doubtful
character.

¹ Herbert Arkin, in G. W. Baehne, ed., Practical Applications of the
Punched Card Method in Colleges and Universities, pp. 4-8, Columbia University
Press, New York, 1935.

² Adapted from a paper by T. C. McCormick, On the Amount of Error in
Sociological Data, American Sociological Review, Vol. 3, pp. 328-332, 1938.
The major difficulties involved in the estimation of errors of
observation are practical rather than mathematical and theoretical
in nature. Determination of the accuracy of findings is
first a question of funds and time, and is tied up with administrative
policies.

In England, the importance of estimating errors of record in
sociological results has been recognized by Arthur L. Bowley,
who writes:

If we do not know of the existence of biassed errors, which in reality
pervade our estimates, there is no remedy; if we know them, we are
likely to obtain more accuracy by the most erroneous corrections for
them than by neglecting them. . . . In the nature of things, when we
are dealing with errors we do not know their magnitude; the most we
can know is their probable and possible extent. We might estimate, for
instance, the percentage of unemployed in a certain year as 4.5, and
add, from information in our possession (coming from a study of wage
bills or the reports of relief agencies), that we considered this to be
within .5 of the fact; we should then write the number 4.5 ± .5, meaning
that the error in the estimate as defined above was unlikely to be
more than .5/4.5 = 1/9, or 11 per cent, the corresponding absolute error
being .5. In such a case we can also give definite limits. The percentage
employed must lie between 0 and 100; and if we could actually
enumerate 1 per cent of the working-class as out of work, and also 92
per cent as in work, we should know that the number required was
between 1.0 and 8.0 per cent, and the maximum error in our estimate, 4.5,
was 3.5/4.5 = 7/9, or 78 per cent. Even this is more precise than the
original statement, the percentage is 4.5, error unknown. By further
investigation we might perhaps bring the limits of error nearer to each
other, and decide that it was practically certain that the percentage
required was between 3.5 and 4.5; then we ought to say “the number
unemployed is .04 . . . of the working class, the estimate being correct
to the last figure given.” This statement is of the same nature as, “The
body weighs 15 lb. 3 oz., correct to an ounce.”¹
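
The arithmetic in the quotation can be retraced step by step. The sketch below uses only the figures Bowley gives; the variable names are illustrative.

```python
estimate = 4.5          # estimated percentage unemployed
probable_error = 0.5    # "within .5 of the fact"

# Relative error: absolute error divided by the estimate.
relative_error = probable_error / estimate

# Definite limits from a partial enumeration: 1 per cent found out of work
# and 92 per cent found in work, so the true figure lies between 1 and 8.
lower_limit = 1.0
upper_limit = 100.0 - 92.0

# Largest possible distance of the estimate from either limit.
max_absolute_error = max(estimate - lower_limit, upper_limit - estimate)
max_relative_error = max_absolute_error / estimate

print(f"probable relative error: {relative_error:.0%}")
print(f"limits: {lower_limit} to {upper_limit} per cent")
print(f"maximum relative error: {max_relative_error:.0%}")
```

The estimate happens to sit midway between the two limits, so the maximum absolute error, 3.5, is the same in either direction.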

As yet, most of the theory underlying the subject of errors
consists of a number of precautions that simply need to be borne
in mind and observed. What seem to be the outstanding points
are briefly summarized below.

¹ A. L. Bowley, Elements of Statistics, 6th ed., pp. 180, 181, 192, Charles
Scribner's Sons, New York, for P. S. King & Son, Ltd., London, 1937.




(1) By definition, “The relative error in an estimate is the
ratio of the difference between the estimate and the true
value, to the estimate.”

(2) Where the necessary a priori information exists, the
results of an investigation may be compared with expectation,
and the extent of the error suggested in this way.
The basis of the expectation must, of course, be justified.

(3) In the absence of adequate comparative data, the only
possible method of finding errors of measurement or
record is to repeat the measurements, or a sufficient proportion
of them. These check measurements may be made
with the same measuring instruments, or by other devices
and approaches, to reveal possible errors due to a particular
method or scale. A change of personnel to find the
amount of error attributable to the “personal equation”
is also important.

(4) Where differences between the original and the check
measurements are found, investigation should continue
until it is possible to correct the error sufficiently for the
purpose in hand by averaging or other estimate.

(5) There are two well-known kinds of error of measurement
or record, whose treatment is different:

a. Unbiased or compensating errors. Some errors occur
in opposite directions, and so wholly or partly cancel
out in sums, averages, and other statistics. Such
random errors, however, increase the value of the
standard deviation and attenuate the correlation
coefficient.¹

b. Biased errors, or errors in the same direction:

(a) Constant error. An error that remains the same
from one measurement to the other, as when a foot
rule is inaccurately divided, is usually hard to
detect, but very common. In social investigation
it may be due to wishful thinking, to loose definition,
to falsification on the part of the subjects interviewed,
and so on.

(b) Accumulative error. Some biased errors increase
from measurement to measurement, as when one
is dealing with more and more difficult material.
Thus, in taking the census, it is less easy to get
accurate answers to certain questions from Negroes
than from whites.

(c) Irregular noncompensating error. When measurements
vary erratically, so that they affect sums
and averages in important but unpredictable ways,
the error must be estimated or eliminated in each
separate measurement.

¹ See Chaps. VIII and X for definition of these terms.

Apart from ingenuity and perseverance, there is no formula
for finding such errors as these. Where they are suspected
but not discoverable, it may be advisable to express results
in the form of ratios, since biased errors are reduced in ratios and
index numbers. As Bowley puts it, “The error in a ratio is
approximately the difference between the errors in its two
terms. . . .”
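
Bowley's rule for ratios can be checked on a small numerical case. The sketch assumes hypothetical true values and hypothetical biases of the same direction in both terms, and it measures relative error as defined in point (1) above, that is, as a fraction of the estimate.

```python
def relative_error(estimate, true_value):
    """Relative error as defined in the text: (estimate - true value) / estimate."""
    return (estimate - true_value) / estimate

# Hypothetical true values of the two terms of a ratio.
true_a, true_b = 200.0, 1000.0

# Both terms are overstated (biased in the same direction).
est_a, est_b = 210.0, 1040.0

error_a = relative_error(est_a, true_a)
error_b = relative_error(est_b, true_b)
ratio_error = relative_error(est_a / est_b, true_a / true_b)

print(f"error in numerator:   {error_a:.4f}")
print(f"error in denominator: {error_b:.4f}")
print(f"difference of errors: {error_a - error_b:.4f}")
print(f"error in the ratio:   {ratio_error:.4f}")
```

Each term is in error by roughly 4 to 5 per cent, yet the ratio is in error by less than 1 per cent, which is about the difference between the two errors.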

In social investigation it is especially important to avoid
misleading accuracy of statement, such as carrying calculations
based on crude data to two or three decimal places. The problem
of how far not to carry significant figures should invariably be
solved on the conservative side, as when, in rough population
estimates running into the millions, even the tens-of-thousands
places are given to zeros, and the hundreds of thousands are
rounded off.

The final statement of an average or other statistic should
include the maximum amount by which it may reasonably be in
error, expressed as a percentage of the value of the statistic,
as already mentioned above. For example, suppose that the annual
church attendances per individual are 58, and that the error of
record in this figure is estimated to be 10 per cent. These facts
may be expressed in some such form as 58 ± 10%.
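
The two recommendations, stating a statistic with its percentage error and rounding rough estimates conservatively, amount to very little computation. In this sketch the church-attendance figure is the one given in the text, while the population figure is hypothetical.

```python
def limits(statistic, percent_error):
    """Lower and upper limits implied by a statement such as 58 ± 10%."""
    margin = statistic * percent_error / 100
    return statistic - margin, statistic + margin

low, high = limits(58, 10)
print(f"58 ± 10% means between {low:.1f} and {high:.1f}")

# A rough population estimate (hypothetical figure) rounded to the nearest
# hundred thousand, so that the lower places are filled with zeros.
rough_estimate = 12_345_678
rounded = round(rough_estimate, -5)
print(f"{rounded:,}")
```

Stating the limits makes plain that a figure such as 58 should not be read as exact to the unit.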

As Bowley warns, it sometimes takes longer to estimate the
approximate amount of error in the results of a study than it
does to make the study itself. If sociologists give proper
attention to the accuracy of their findings, therefore, they are
certain to be forced by the interests of economy of time and
money to simplify their problems and to investigate the same
population as often as feasible. This is true if accuracy is
regarded as a purely relative thing, which need be no greater
than is required to obtain a satisfactory answer to a question in
hand.


Exercises 

1. What use may be made of nonquantitative methods in statistical
research?

2. Make a list of the main requirements of a well-chosen statistical
problem, and give illustrations of what you consider good and poor,
with your reasons.

3. With which of the chief sources of secondary statistical data in
the United States are you acquainted?

4. Select from the latest United States Census a few definitions that
seem to you (a) satisfactory, (b) unsatisfactory, and explain why you
think so.

5. What are some of the most unreliable counts in the United States
Census of Population, and why?

6. Collect instances of studies in which a questionnaire was mailed
out and report on the proportion and representativeness of the returns
received.

7. a. In the statistical laboratory, propose problems on a competitive
basis; and after a problem has been chosen, help design a study which
your class in social statistics will carry out as a semester's project.

b. Does the problem satisfy the requirements that you listed under
question 2 above?

c. Indicate by which of the methods described in Chap. II the most
important traits or factors concerned in this study will be measured,
and show that no more exact measurement is feasible.

d. What is the dependent variable?

e. What are the main independent variables?

f. What are the important interfering factors?

g. How will the interfering factors be controlled?

h. Is the sample adequate in size?

i. What is your assurance that it is representative?

j. Does the schedule meet the demands mentioned in this chapter?
Review the points.

k. Do you have all the tables that will be needed for computation
and exhibition purposes, and for interpreting the data?

l. Do your instructions leave any important terms undefined, or
any procedures unexplained?

m. By what methods do you propose to test the reliability and, if
necessary, the validity, of your schedule?

n. To what extent have you used the method of cooperative definition
to improve the validity of your indexes?

o. Will you try to measure the error due to the personal equation of
the interviewers?

p. How will you estimate the amount of error in your final results?
CHAPTER V

TABULATION OF FREQUENCY DISTRIBUTIONS 


1. A Problem. — Before large groups of figures of any kind
can be studied and interpreted, they must be arranged, or
tabulated, in some orderly and meaningful way.

As a first exercise in the tabulation of statistical data, let
us investigate the sizes of sibling families from which the students
at a given college come. A definition is needed. What is meant
by “sizes of sibling families”? Let us say that we mean the
number of brothers and sisters, including the student. The
sibling family, then, is the thing to be measured, while a sibling
is the unit of count or measurement. Are siblings deceased to be
counted? What of siblings married and moved away? What
of adopted siblings, or other children not brothers or sisters
reared in the family? Always such questions of definition
of the thing to be measured and of the unit of measurement
arise in the beginning of a careful inquiry, statistical or otherwise,
and must be settled with the purpose of the investigator in view.
In the present case, let us say that deceased siblings, siblings
away from home, and children adopted or reared as siblings
in the family shall be included.

Assuming that we have defined the thing to be counted or
measured, a sibling family, and the unit of count or measurement, a
sibling, and that the units are equal and equivalent for our purpose,
we then ask each student to tell the size of his sibling family.
Let us imagine that 200 students give the following sizes of sibling
families:

 2   2   4   1   3   1  12   6   7   2
 3   1   5   2   3   4   1   2   2   5
 2   6   1   4   1   2   3   3   6   3
 3   2   3   1   1   2   2   2   3   4
 1   1   2   2   2   2   3   8   2   2
 5   8   2   3   1   2   5   3  15   6
 2   3   1   1   3   9   1   8   2   2
 3   1   4   3   3   4   1   2   2   2
 3   2   2   1   2   2   3   7   3   6
 4   2   5   2   1   2   5   3   2   8
 1   7   6   9   1   1   1  12   5   4
 1   3   1   1   3   3   1   5   2   2
 2   6   3   5   1   5   2   2   3   5
 2   5   3   3   2   5   5   2   3   2
 2   4   5   1  14   1   7   3   6   2
 3   2   1   4   4   4   1   2   2   6
 4  11   3   1   1   2   3   6   3   4
 1   3   2   1   4   4   1   6   2   2
 3   2   1   4   6   2   5   3   4   2
 4   4   1   4   4   1   3   4   4  10


2. The Frequency Distribution: Discrete Variable. — We have
here 200 values, varying from 1 to 15. So far, the answer to our
wish to know the sizes of sibling families to which the students
belong is rather confusing. The range, or spread between the
smallest and the largest values, is the clearest bit of information
we have. It extends from 1 to 15, and is therefore 14. We
should also like to know how many families of each size there are.
As a preliminary step to this end, it is convenient to put the
items in the form of an array, which means merely putting them
in order of size.


Table 5 — Abeat of Values 
1111222222333344556 8 
1111222222333344556 8 
1111222222333444566 9 
1111222222333444566 9 
1111222223333444567 10 
1111222223333444567 11 
1111222223333445567 12 
1111222223333445567 12 
1111222223333445568 14 
1112222223333445568 15 


As a rule, however, a better form of the array is the frequency
array. It is obtained from the original data by setting up a
consecutive series of numbers covering all the observed values
(here, the sizes of sibling families, 1, 2, 3, etc.) in the left-hand
column of Table 6, and tallying in the right-hand column the
number of times each consecutive value (size of family) occurs.¹
The latter figures are termed frequencies.


¹ Tallying is commonly done by making in the proper row a sloping
stroke for each item (e.g., family) until four strokes are made, then drawing
a stroke through them for the fifth item. The tallies

Table 6. — Frequency Array of Values

Size of             Students Reporting
Sibling Family        (Frequencies)
      1                    39
      2                    55
      3                    38
      4                    24
      5                    16
      6                    12
      7                     4
      8                     4
      9                     2
     10                     1
     11                     1
     12                     2
     13                     0
     14                     1
     15                     1

   Total                  200


This gives the same information as Table 5, but in a much
more compact form. Table 6 also satisfies our curiosity relative
to the number of students reporting each size of sibling family.
We see at once that most students are members of families of
three or fewer siblings.
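The tallying described above can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the frequencies are copied from Table 6, and the names `table6`, `array`, and `frequency_array` are chosen here for convenience. The sketch rebuilds the array of Table 5, tallies it again, and confirms the range and the count of families of three or fewer.

```python
from collections import Counter

# Frequencies copied from Table 6: size of sibling family -> students reporting.
table6 = {1: 39, 2: 55, 3: 38, 4: 24, 5: 16, 6: 12, 7: 4, 8: 4,
          9: 2, 10: 1, 11: 1, 12: 2, 13: 0, 14: 1, 15: 1}

# Rebuild the 200 individual answers in order of size (the array of Table 5).
array = [size for size, f in table6.items() for _ in range(f)]

# Tallying: count how often each value occurs, then list every consecutive
# size from the smallest to the largest, including sizes nobody reported.
tally = Counter(array)
frequency_array = {size: tally.get(size, 0)
                   for size in range(min(array), max(array) + 1)}

value_range = max(array) - min(array)
three_or_fewer = sum(f for size, f in frequency_array.items() if size <= 3)

print(frequency_array)
print("items:", len(array), "range:", value_range)
print("families of three or fewer siblings:", three_or_fewer)
```

Listing every consecutive size, rather than only the sizes that occur, is what keeps the empty class (13) visible in the frequency array.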

We shall next try lumping together into classes, or class
intervals, more than one size of family, with the double purpose
of showing more smoothly how the students are grouped with
respect to size of family, and of more easily calculating averages¹
and other statistics from the table. Imagine combining into
classes family sizes 1 and 2, 3 and 4, 5 and 6, 7 and 8, and so on.
We then get Table 7.

The work of combining the frequencies should be carefully
checked by repetition, and it should be noted that the total is
the same as for Table 6.


are next counted, and a figure representing the total number of items
in the row is entered in the frequency column of the table. The work of
tallying should be repeated, as a check, and the total should agree with the
number of original items (sibling families). When machine methods of
tallying are used, the sorting machine counts the frequency in each class,
and the resulting totals are simply read off and entered in the table.

¹ But the average found from Table 7 will be less accurate than that found
from Table 6.

Table 7. — Frequency Distribution of Data

Size of             Students Reporting
Sibling Family        (Frequencies)
  1 and 2                  94
  3 and 4                  62
  5 and 6                  28
  7 and 8                   8
  9 and 10                  3
 11 and 12                  3
 13 and 14                  1
 15 and 16                  1

   Total                  200


Table 7 is still more concise than Table 6, and the distribution
of the frequencies is more regular. There are no classes of zero
frequencies, but instead a rather steady decline in the number of
cases as the size of family increases, which is what one would
expect.
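The combining of frequencies that turns Table 6 into Table 7 can be checked mechanically. The sketch below is illustrative only; the variable names and the rule used to find each class (`lower`, `upper`) are assumptions made here, fitted to whole-number classes of width 2 that begin at 1.

```python
# Frequencies copied from Table 6.
table6 = {1: 39, 2: 55, 3: 38, 4: 24, 5: 16, 6: 12, 7: 4, 8: 4,
          9: 2, 10: 1, 11: 1, 12: 2, 13: 0, 14: 1, 15: 1}

width = 2  # each class holds two consecutive sizes: 1-2, 3-4, ...

table7 = {}
for size, f in table6.items():
    lower = ((size - 1) // width) * width + 1   # 1, 3, 5, ...
    upper = lower + width - 1                   # 2, 4, 6, ...
    table7[(lower, upper)] = table7.get((lower, upper), 0) + f

for (lower, upper), f in table7.items():
    print(f"{lower:2d} and {upper:2d}: {f}")

# The check recommended in the text: grouping must not change the total.
print("total:", sum(table7.values()))
```

The empty class of Table 6 (size 13) disappears because it is absorbed into the class 13 and 14.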

Tables 6 and 7 are called simple frequency distributions, or
merely frequency distributions, because they show the frequency of
occurrence of a set of values arranged in order of size. Table 6
was also called a frequency array because successive class values
increased by single units.

In Table 7 the question arises, What is now the size of family
in each class? In the first class, is the size of family the average
of 1 and 2, that is, 1.5? This is more reasonable than to say that the
size is either 1 or 2. But how can a family consist of one person
and a half person? Is not this taking liberties with the data?
The trouble is due to the circumstance that we are dealing with a
discrete series, i.e., a series that can take only certain values (whole
numbers) and no intermediate values. Thus a sibling family
may contain 1, 2, or 3 members, but not 1.3, 2.7, or 3.6 members,
because people always come in wholes. In contrast to a discrete
series is a continuous series, in which the variable¹ may assume
any whole or decimal value whatever. The ages in years of the
students in a sample represent a continuous series: 19.3, 20.4,
20.6, 21.2, 21.7, 21.9, 22.1, 22.5. While a continuous series can
always be mathematically averaged without logical offense,
this is not true of a discrete series. For example, the ages in
years of five students are 19.3, 20.4, 21.7, 21.9, 22.1, and their
average (arithmetic mean²) age is 21.08, which is a possible value.
But if five sibling families are of sizes 1, 2, 2, 3, and 5,
respectively, the mean is 2.6, which is a fictitious value. We are thus
faced with the dilemma either of disregarding the logical nature
of a discrete series, or of abandoning the attempt to analyze it
in terms of averages and other mathematical concepts. Since
the purpose of an average is to simplify and represent a series, a
fractional value may serve this end in the case of a discrete series,
even though it is not strictly realistic, and many valuable facts
can be discovered in this way that otherwise would not appear.
For these reasons, discrete series are usually thrown into
frequency distributions and treated in some ways as if they were
continuous.

¹ A quality (e.g., sibling family) that varies in size or amount.

Returning now to Table 7, we may regard the average value
of the two sizes of families grouped together in each class as the
mid-point of the class (e.g., for the first class of Table 7, the
mid-point is (1 + 2)/2 = 1.5). When any item is placed in a class
with other items, it is understood that it thereupon exchanges its
original value for that of the mid-point of the class. For
example, when a family of 4 siblings is placed in the class 3 and 4 in
Table 7, the 4 is thereafter treated as if it were 3.5. The
mid-point of any class should, therefore, always be as close as
possible to the true average of the items included in the class.
From Table 6 we see that the true weighted³ mean size of the
families of 1 and 2 siblings is

    [(39 × 1) + (55 × 2)]/94 = 149/94 = 1.585,

whereas our mid-point is 1.5. This is rather close agreement,
and may be satisfactory for our purposes. The mid-points of
other classes may be similarly tested. From Table 6 it can be
seen that the mid-point of the first class in Table 7 is too small,
because there are more families of 2 than of 1; but this is
somewhat offset by too large a mid-point in the next class; and so on.
Where one error balances another in this way, the accuracy of the
mean found from the table is improved, although the mid-points
of some of the classes may not be too good. Recasting Table 7
in mid-point form, we have

Table 8. — Frequency Distribution of Data: Mid-point Form

Size of sibling family,   Students reporting   Product
    mid-point (X)                (f)             (fX)
         1.5                      94             141.0
         3.5                      62             217.0
         5.5                      28             154.0
         7.5                       8              60.0
         9.5                       3              28.5
        11.5                       3              34.5
        13.5                       1              13.5
        15.5                       1              15.5

       Total                     200             664.0

² (19.3 + 20.4 + 21.7 + 21.9 + 22.1)/5 = 21.08. See Chap. VII.

³ In the weighted mean, each value (e.g., size of family) is counted as
often as it occurs.


If we calculate the arithmetic mean from Table 8, we may
compare it with the true mean found from Table 6. To find
the mean, we multiply each mid-point by its frequency, sum
the products, and divide by 200. This gives for the data of
Table 8 a mean of 3.32, and for Table 6 a true mean of 3.315,
which in this case are nearly identical. We may, therefore,
approve Table 8 as far as this test is concerned.
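The comparison just made can be reproduced as a short calculation. The sketch below is illustrative, with names chosen here (`true_mean`, `grouped_mean`, `first_class_mean`); the frequencies are those of Tables 6 and 7, and each class of Table 7 is replaced by its mid-point as in Table 8.

```python
# Frequencies copied from Table 6 (ungrouped) and Table 7 (grouped).
table6 = {1: 39, 2: 55, 3: 38, 4: 24, 5: 16, 6: 12, 7: 4, 8: 4,
          9: 2, 10: 1, 11: 1, 12: 2, 13: 0, 14: 1, 15: 1}
table7 = {(1, 2): 94, (3, 4): 62, (5, 6): 28, (7, 8): 8,
          (9, 10): 3, (11, 12): 3, (13, 14): 1, (15, 16): 1}

n = sum(table6.values())

# True (weighted) mean: every size counted as often as it occurs.
true_mean = sum(size * f for size, f in table6.items()) / n

# Grouped mean: every item takes the mid-point of its class (Table 8).
products = {(lo + hi) / 2: ((lo + hi) / 2) * f
            for (lo, hi), f in table7.items()}
grouped_mean = sum(products.values()) / n

# Test of a single mid-point: true mean of the families of 1 and 2 siblings.
first_class_mean = (39 * 1 + 55 * 2) / (39 + 55)

print("sum of products fX:", sum(products.values()))
print("grouped mean:", round(grouped_mean, 3))
print("true mean:", round(true_mean, 3))
print("true mean of first class:", round(first_class_mean, 3))
```

The same pattern, a dictionary of mid-points and frequencies, serves to test any other trial grouping of the data.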

In the case of the data on sibling families, we need to show only
the lowest and highest whole numbers that can fall within a
class, because we are dealing with discrete or whole numbers.
These upper and lower limits of a class are called class limits.
We may set up the stub, or first column, of the frequency
distribution as shown in Table 7, or, if we prefer, we may write 1-2,
3-4, 5-6, and so on. Frequency distributions are usually given
in class limit rather than in mid-point form, but the latter is also
common. The former is better suited for tallying, the latter for
computing purposes.

3. Selection of a Class Interval. — The suggestions usually
given to aid in choosing a class interval for untabulated data are:

1. Note the range of the data, i.e., the difference between the
largest and smallest values of the variable.

2. Decide about how large the interval has to be to make a
significant difference in the data. For example, a difference of
less than five points in a distribution of students' grades would
seem to be of no consequence, since most teachers make no
attempt to grade closer than that. Indeed, 10 points may
seem to some sufficiently close.
If the values already have some natural spacing, the latter
should often be taken as the interval. For example, the size of
farms in certain regions tends to be a multiple of 40 acres: 40,
80, 120, 160, etc.

3. Consider how many class intervals would result if the size
of interval tentatively chosen in (2) above were divided into
the range found in (1). As a rule, from 10 to 20 intervals are
desirable, although, of course, more or fewer are permissible.
Revise the size of interval suggested in (2) somewhat, if it seems
advisable.

4. Make all intervals of equal size, if feasible, and avoid
open-end intervals when possible.

5. Decide tentatively upon the mid-points and class limits of
the intervals. Unless difficulty in classification is introduced,
the mid-points should be whole numbers for convenience in
computing, and if they can be multiples of 5 or 10, so much
the better.

6. Tally the data in the class intervals chosen. Note whether
the resulting distribution reveals a smooth trend in the
frequencies from one end of the scale to the other, avoiding an
irregular, broken effect. If too large an interval has been used,
some points of interest relative to increase or decrease of
frequencies will be concealed. If the interval is too small, the
distribution will lack smoothness. It is often necessary to try
tabulations by larger and smaller intervals to decide these
points.

7. The accuracy of the class interval chosen for computation
purposes should be tested by calculating the arithmetic mean
from the table and comparing it with the true mean found from
the ungrouped data or from a large random sample of the
ungrouped data. To obtain a class interval that will give
maximum accuracy it is often helpful to use a sliding scale
device like that illustrated below (Fig. 6, applied to Fig. 5).

The application of these suggestions will be illustrated.

Below is a list of the final grades of a class in statistics:

80 81 87 83 94
94 85 78 82 85
85 81 87 87 65
88 81 80 75 70
73 63 77 80 88
78 73 68 79
68 83 70 83
72 84 74 88
76 90 85 88

Ordinarily it is not worth while to set up a frequency
distribution for 41 cases, but a small number is used here for convenience.

The first step is to arrange these values in order of size, to
form a frequency array.


Table 9. — Frequency Array of Values

Grades   Frequency      Grades   Frequency
 (X)        (f)          (X)        (f)
  63         1            79         1
  65         1            80         3
  68         2            81         3
  70         2            82         1
  72         1            83         3
  73         2            84         1
  74         1            85         4
  75         1            87         3
  76         1            88         4
  77         1            90         1
  78         2            94         2

Total                               41


The range is 94 - 63 = 31. Intervals of less than five do not
seem justified by the accuracy of the data. A natural grouping,
or tendency for the grades to cluster about multiples of five,
would be expected. There would be only three or four intervals
of 10, which seem too few. A trial interval of five, with
mid-points at 65, 70, 75, etc., is shown in Table 10. These
mid-points are especially appropriate, because the clustering of
the grades around them should increase the accuracy of the
table for computing averages and other statistics, as illustrated
above. Narrow class intervals are also generally more accurate
for computation purposes than are wide ones. Like the data
on sibling families, these percentage grades are given only in
whole numbers, and so may most conveniently be regarded as
discrete. If 65 is taken as the mid-point of an interval of 5,
evidently the lowest grade that belongs in this interval is 63
and the highest is 67, so that the five grades 63, 64, 65, 66, and
67 are included. If the width of the class interval were an
even number, such as 4, instead of an odd number, such as 5,
the mid-points would be forced to take a decimal value, as was
the case in Table 7.

Fig. 5. — Data of array on page 66 plotted on unit scale.

Fig. 6. — Sliding scale, with trial interval of 4.


Table 10. — Frequency Distribution of Students' Grades

Grades     L*       Frequency     fX
 (X)                   (f)
  65      63-67         2         130
  70      68-72         5         350
  75      73-77         6         450
  80      78-82        10         800
  85      83-87        11         935
  90      88-92         5         450
  95      93-97         2         190

Total                  41       3,305

* L means class limits.


From inspection of Fig. 5, where each value has been plotted
along the grade scale, it appears that the above choice of class
intervals throws the mean below the mid-point in the intervals
63-67, 68-72, 73-77, 83-87, 88-92, 93-97. Only in two intervals,
however, 88-92 and 93-97, is the lack of balance serious. In
one interval, 78-82, the mean is at the mid-point. The true
mean of the series, computed from the separate values, is 80.195.
The mean found from the frequency distribution of Table 10 is
80.610, which is 0.415 too high, as would be expected. If this
amount of inaccuracy is considered important for the purpose
in hand, an attempt to obtain a better class interval should be
made. This may be facilitated by making a sliding scale from
ordinary coordinate paper, using the same units as in Fig. 5
(see Fig. 6). Class intervals of different sizes may be measured
off on the sliding scale, and each tested in turn against the scale
in Fig. 5. In the case of any given interval, the trial scale (Fig. 6)
is moved along the fixed scale (Fig. 5) until the frequencies
shown on the fixed scale are as evenly balanced as possible
around the mid-points of the trial class interval on the sliding
scale. If satisfactory, the values of the class limits may then
be read off on the fixed scale from the intervals on the sliding
scale when in this position of balance. Usually, some inaccuracy
is inevitable in the use of class intervals. The problem is to
keep it within such limits that no serious damage will result to
the conclusions of the study.
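The test of Table 10, and the trial-and-error that the sliding scale performs by eye, can be imitated numerically. The following is a rough sketch under stated assumptions rather than the method of the text: the function `grouped_mean` and the idea of trying each possible starting grade are introduced here for illustration, and "balance" is judged only by how close the grouped mean comes to the true mean.

```python
# The 41 final grades listed in the text.
grades = [80, 81, 87, 83, 94, 94, 85, 78, 82, 85, 85, 81, 87, 87, 65,
          88, 81, 80, 75, 70, 73, 63, 77, 80, 88, 78, 73, 68, 79, 68,
          83, 70, 83, 72, 84, 74, 88, 76, 90, 85, 88]

def grouped_mean(values, start, width):
    """Mean after each whole-number value is replaced by the mid-point of
    its class; classes run start..start+width-1, start+width.., and so on."""
    total = 0.0
    for v in values:
        lower = start + ((v - start) // width) * width
        total += lower + (width - 1) / 2      # mid-point of a discrete class
    return total / len(values)

true_mean = sum(grades) / len(grades)
table10_mean = grouped_mean(grades, start=63, width=5)   # classes 63-67, ...

# "Sliding" the scale: try each of the five possible positions of a
# width-5 interval and note how far each grouped mean is from the truth.
trials = {start: grouped_mean(grades, start, 5) for start in range(59, 64)}
best_start = min(trials, key=lambda s: abs(trials[s] - true_mean))

print("true mean:", round(true_mean, 3))
print("Table 10 mean:", round(table10_mean, 3))
for start, m in trials.items():
    print(f"classes beginning at {start}: mean {m:.3f}, "
          f"error {m - true_mean:+.3f}")
print("closest position begins at", best_start)
```

A position that gives a closer mean may give mid-points that are not multiples of five, so the convenience urged in suggestion 5 has to be weighed against the gain in accuracy.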

Table 11. — Birth Rates per 1,000 in 150 Approximately Equal
Populations

Class limits    Mid-points    Frequencies
    (1)            (2)            (3)
 12.5-13.4          13              3
 13.5-14.4          14             15
 14.5-15.4          15             26
 15.5-16.4          16             31
 16.5-17.4          17             43
 17.5-18.4          18             25
 18.5-19.4          19              5
 19.5-20.4          20              2

   Total                          150


4. The Frequency Distribution: Continuous Variable. — A
continuous variable, such as birth rates, is tabulated in the same
way as a discrete variable, except that a slight modification is
needed in finding class limits from mid-points, and vice versa.
In Table 11, given the mid-points of col. (2), what are the lower
and upper values of each class within which the birth rates can
be classified? The boundary line between any two mid-points
should evidently be halfway between them, in this case 1/2 of 1,
or 0.5 unit above the lower or below the higher mid-point. We
thus get as our class limits in col. (1), 12.5, 13.5, 14.5, and so on.
Notice that the upper limit of any class is made slightly smaller
than the lower limit of the class just above, to indicate that a
case falling exactly on the border line between two classes is
placed in the upper class rather than in the lower.¹ Assuming
that our original data carry only one decimal place, it is enough
to write the upper limit of the class 12.5 to 13.5, for example, as
13.4, but if the data carried two decimal places, the upper limit
should be written 13.49, and so on.

To find the mid-points, given the class limits, of the continuous
variable of Table 11, we add the lower limit of an interval to
the lower limit of the interval next higher on the scale, then average
them.

    (12.5 + 13.5)/2 = 13

    (13.5 + 14.5)/2 = 14

and so on.
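The two conversions just described, from mid-points to class limits and back, can be written out as a small calculation. This is a sketch for illustration only; the names `lower_limits`, `upper_limits`, `recovered_midpoint`, and `classify` are assumptions made here, and the data are taken to carry one decimal place, as in Table 11.

```python
# Mid-points of Table 11, col. (2).
midpoints = [13, 14, 15, 16, 17, 18, 19, 20]
width = midpoints[1] - midpoints[0]     # 1 unit between mid-points
decimals = 1                            # the data carry one decimal place

# The boundary lies halfway between neighbouring mid-points.
lower_limits = [m - width / 2 for m in midpoints]

# The written upper limit is one unit of the last decimal place below the
# next lower limit, so that a borderline case goes to the upper class.
step = 10 ** -decimals
upper_limits = [round(lo + width - step, decimals) for lo in lower_limits]

def recovered_midpoint(i):
    """Average the lower limit of class i and of the class next above."""
    next_lower = lower_limits[i] + width
    return (lower_limits[i] + next_lower) / 2

def classify(rate):
    """Return the mid-point of the class in which a birth rate falls."""
    i = int((rate - lower_limits[0]) // width)
    return midpoints[i]

for lo, hi, m in zip(lower_limits, upper_limits, midpoints):
    print(f"{lo}-{hi}  mid-point {m}")
print("a rate of exactly 13.5 is placed in the class with mid-point",
      classify(13.5))
```

Changing `decimals` to 2 gives written upper limits of 13.49, 14.49, and so on, as the text requires for data carried to two places.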

5. The Frequency Distribution: Nonquantitative Variable.² —
Let us imagine that, instead of adopting quantitative classes for
sizes of sibling families, as shown in Table 7, we had asked
the students to state whether the size of their sibling
family was large, medium, or small, without telling them what
sizes of families should be placed in each of the three categories.
We might then get a table something like Table 12.

Table 12. — Size of Sibling Families

Size of             Students
Sibling Family      Reporting
Small                  116
Medium                  46
Large                   38

   Total               200


We may now call attention to three requirements of
classification that were not mentioned in our previous work, although
they were tacitly assumed. The first of these is that the
categories must be mutually exclusive. The second is that they must
be exhaustive. The third is that there must be only one basis of
classification at a time.

¹ Theoretically, the frequency of a value that is identical with a class limit
should perhaps be divided equally between the two classes above and below,
but in most practical work the method suggested above is more convenient
and sufficiently accurate.

² A nonquantitative variable is a quality that varies in amount, but is not
measured in terms of units.

With respect to the last-named requirement, the basis of
classification in Table 12 is size of sibling family. There is no
evidence that any other principle was used in this table. A
question may be raised, however, about the first requirement.
If we checked to see, we should certainly find that some sibling
families of three were listed as small and some as medium, and
that similar errors were made in the case of families of other
sizes. Moreover, if the sibling families reported above were
entered in Table 12 by two independent investigators, even
though neither happened to put families of the same size in two
different classes, there is little chance that they would both
classify each size of family in the same class. Some would regard
a family of three as medium, others would regard it as small.
Their finished tables would not show the same frequencies in each
class. Because these difficulties of classification multiply with
the number of classes, it is usually advisable to have very few
classes in a qualitative table, e.g., three in Table 12. This
limits the analysis to broad categories.

Regarding the principle of exhaustiveness of classification, we
need to ask: Were there any families that could not be classified in
one of the three classes of Table 12? Apparently there were not,
so the table passes this test.

Can we calculate the mean of Table 12, as we did in Table 8?
At once the question arises, what are the values of the mid-points
in Table 12? Since the classes in this table are not quantitative,
no quantitative values can be assigned to their mid-points. We
therefore discover that we are unable to analyze a nonquantitative
table by the use of the mean. All that we can do is to say
that the modal class, or the class containing the largest frequency,
is that of small families.
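What can and cannot be computed from Table 12 may be seen in a brief sketch. It is illustrative only, and the names `table12` and `modal_class` are chosen here. The labels are plain words, so there is nothing to multiply by the frequencies; the largest frequency, however, can still be located.

```python
# Table 12: the classes are labels, not numbers, so no mid-points exist.
table12 = {"Small": 116, "Medium": 46, "Large": 38}

# The modal class is simply the label with the largest frequency.
modal_class = max(table12, key=table12.get)

# Exhaustiveness check: every one of the 200 students falls in some class.
total = sum(table12.values())

print("modal class:", modal_class)
print("total students classified:", total)
```

Any attempt to form a mean would first require assigning numbers to "Small," "Medium," and "Large," which is exactly the step the table gives no basis for.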

From this illustration, we learn that nonquantitative tables
not only are likely to violate the logical principle of classification,
which requires that the several categories be mutually exclusive,
but that they also do not lend themselves to the calculation of
the mean and other basic statistical measures by which quantitative
tables are customarily analyzed. For such reasons as
these, quantitative classes are always to be preferred to qualitative
for purposes of statistical analysis. The latter should be
employed only where quantitative classes are not obtainable.

6. The Frequency Distribution: Table Structure.¹ — The main
heading of a frequency table is called the title; the left-hand
column with its heading, the stub; and the heading of the
right-hand column, the caption. These are illustrated in Table 13.


(Title) Table 13. — The Size of Sibling Families of 200 Students of
Sociology, Blank College, 1939-1940

(Stub) Siblings in Family      (Caption) Students Reporting
         1- 2                              94
         3- 4                              62
         5- 6                              28
         7- 8                               8
         9-10                               3
        11-12                               3
        13-14                               1
        15-16                               1

        Total                             200


As far as feasible, a table should be self-explanatory, but
no unnecessary word or figure should be included. The title
should usually mention the variable in the stub first; the units
of the caption and their number, second; and any further
subdivisions of the stub or caption. It should also generally
mention the date and place. The purpose of the stub and the caption
is simply to indicate the nature of the entries in the columns.
The "Total" row is often placed at the top instead of at the
bottom of the table. One customary type of ruling is shown in
Table 14. If it is desirable to block off one part of a table from
another, this may be done by means of a heavy or double ruling.

Table 14. — The Size of Sibling Families of 200 Students, Blank
College, 1940-1941, by Urban and Rural Residence

                          Students reporting
Siblings in family     Total    Urban    Rural
       1- 2              94       28       66
       3- 4              62       20       42
       5- 6              28       11       17
       7- 8               8        5        3
       9-10               3        2        1
      11-12               3        0        3
      13-14               1        0        1
      15-16               1        0        1

      Total             200       66      134

The chief requirement of a good table is that it be simple and
clear. For this reason, it is generally unwise to subdivide the
stub or the caption very often. One simple subdivision of the
caption is shown in Table 14.

In the case of every subclassification of the data in a table,
the principles of classification already mentioned apply.

¹ More detailed discussion of this topic and of tables that do not represent
frequency distributions will be found in the fifth and seventh references at
the end of this chapter.

Exercises

1. Tabulate each of the two following series in a frequency
distribution, showing class limits, and test the accuracy of each of the
tables. Note: The population numbers are so large that only whole
hundreds or thousands should be used as class limits and mid-points,
but in finding mid-points from the class limits the method suggested for
a continuous variable should be used. An interval at least as small as
5,000 seems to be needed to differentiate between the bulk of the county
populations under 40,000. But above that point increasingly large
intervals are appropriate. The last interval may be taken as "300,000
and over," with the actual population of the single largest county,
318,587, given in a footnote. A table may be "broken" to avoid many
intervals without frequencies.

Georgia Counties, 1930*

County  Population  Population per    County  Population  Population per
                     square mile                            square mile
   1      13,314        29.3            46      21,599        50.1
   2       6,894        20.9            47      18,025        45.4
   3       7,055        26.0            48      22,306        65.2
   4       7,818        21.9            49       9,461        45.5
   5      22,878        74.5            50      18,273        34.9
   6       8,703        43.7            51       2,744         7.6
   7      12,401        73.8            52      10,164        22.7
   8      25,364        53.9            53      18,485        51.2
   9      13,047        51.0            54      24,101        31.5
  10      14,646        32.2            55       7,102        24.7
  11      77,042       278.1            56      12,969        32.3
  12       9,133        44.6            57       8,665        37.0
  13       6,895        15.9            58      48,667        96.9
  14      21,330        41.5            59      10,624        43.0
  15       5,952        13.8            60      15,902        57.0
  16      26,509        39.7            61     318,587     1,650.7
  17      29,224        30.6            62       7,344        16.7
  18       9,345        46.0            63       4,388        25.8
  19      10,576        37.2            64      19,400        44.2
  20       6,338         8.9            65      16,846        44.9
  21       9,903        46.9            66      19,200        43.2
  22       8,991        39.4            67      12,616        30.3
  23      34,272        69.7            68      27,853        63.3
  24       9,421        55.7            69      12,748        44.0
  25       4,381         5.5            70      30,313        69.4
  26     105,431       284.9            71      13,070        24.7
  27       8,894        40.8            72      13,263        46.7
  28      15,407        47.0            73      11,140        22.2
  29      20,003        46.6            74      15,174        58.1
  30      25,613       224.7            75       9,102        31.9
  31       6,943        34.2            76      15,924        49.1
  32      10,260        72.3            77      11,280        25.5
  33       7,015         9.4            78      12,199        32.3
  34      35,408       100.3            79      21,609        60.9
  35      19,739        31.2            80       8,594        26.8
  36      30,622        57.9            81       8,118        27.1
  37       8,793        25.1            82      20,727        32.1
  38      11,311        46.9            83      12,908        37.7
  39      25,127        56.7            84      12,681        43.4
  40       7,020        22.0            85       8,992        23.9
  41      17,343        62.6            86       9,754        63.0
  42       4,146        22.3            87       5,190        27.2
  43       3,502        16.2            88      32,693        40.6
  44      23,622        40.5            89       8,328        25.5
  45      70,278       258.4            90       8,153        15.0
Georgia Counties, 1930* (Continued)

County  Population  Population per    County  Population  Population per
                     square mile                            square mile
  91       7,847        27.0           126      20,503        25.8
  92       4,180        10.6           127       7,389        30.8
  93      29,994        62.1           128      23,495       112.4
  94       4,927        17.6           129      11,740        70.7
  95       9,014        31.4           130      11,114        27.0
  96       5,763        12.3           131      26,800        58.8
  97      16,643        50.1           132       8,458        27.1
  98      14,921        52.5           133       6,172        29.1
  99       6,968        19.4           134      16,411        33.1
 100      22,437        45.2           135      10,617        31.2
 101       9,076        35.9           136      14,997        40.2
 102       6,730        49.1           137      18,290        51.8
 103      23,620        43.1           138      32,612        61.5
 104      11,606        24.7           139      16,068        66.1
 105      10,020        52.7           140      17,166        43.7
 106      12,488        32.0           141       4,346        24.0
 107       9,215        26.9           142       7,488        28.6
 108      57,558       244.9           143      36,752        84.5
 109      17,290        66.0           144      11,196        48.5
 110       8,082        47.0           145       8,372        26.7
 111      12,927        25.6           146       6,340        19.6
 112      12,327        38.0           147      19,509        61.5
 113      10,268        57.4           148      26,206        60.7
 114       9,687        41.9           149      21,118        63.8
 115      12,522        36.3           150      26,558        34.4
 116      10,853        45.8           151      11,181        27.7
 117      25,141        79.3           152      25,030        37.4
 118       9,005        34.9           153      12,647        20.6
 119       8,367        23.2           154       5,032        16.7
 120       3,820        26.5           155       9,149        34.7
 121       6,331        16.8           156       6,056        24.7
 122      17,174        41.7           157      20,808        73.5
 123      72,990       228.8           158      13,439        33.3
 124       7,247        60.9           159      15,944        34.8
 125       5,347        34.7           160      10,844        23.0
                                       161      21,094        32.4

* From the Fifteenth Census of the United States, 1930, Bureau of the Census, Washington,
D. C.


2. Subdivide the table of county populations prepared in Exercise 1
above according to population per square mile, choosing your own points
of division in the latter factor.

3. Open a textbook in elementary sociology to some page at random,
and classify each word on the page as "Very short," "Short,"
"Average," "Long," "Very Long." Show the results in tabular form. Do
the same thing for an elementary textbook in economics, and compare
the length of words in the two tables.

4. It is wanted to know the occupation of the fathers of students
majoring in sociology. The students are asked to check the form
below:

Laborer

Businessman

Professional

Farmer

Is this satisfactory?

Where would a carpenter-contractor be placed? A policeman? The
proprietor of a radio repair shop?

5. A study is to be made of farm wages in your state. How would
you define the unit of study?

6. Explain and illustrate the meaning of these terms: (a) array,
(b) range, (c) frequency distribution, (d) class interval, (e) mid-point,
(f) class limits, (g) grouped data.

7. What is the effect of tabulation by class intervals on the accuracy
of statistics calculated from a table? Why is this?

CHAPTER VI

GRAPHS¹

1. Graphs of Frequency Distributions. — It is often helpful in
interpreting a frequency distribution or other statistical data
to show the facts in graphic form. One method of picturing a
simple frequency distribution is by means of the histogram.
Table 15 may be represented as shown in Fig. 7.


Table 15. — Grades Made by 41 Students of Statistics, Blank
College, 1939-1940

Grades,      Students    Accumulated    Accumulated percentage
per cent                  frequency          frequency*
 63-67           2             2                 4.9
 68-72           5             7                17.1
 73-77           6            13                31.7
 78-82          10            23                56.1
 83-87          11            34                82.9
 88-92           5            39                95.1
 93-97           2            41               100.0

 Total          41

* Each accumulated frequency is expressed as a percentage of 41, e.g.,
7/41 = 17.1 per cent.
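The accumulated columns of Table 15 follow from the class frequencies by a running sum. The sketch below is illustrative, with names chosen here (`cumulative`, `cumulative_pct`); percentages are rounded to one decimal place, as in the table.

```python
from itertools import accumulate

# Class limits and frequencies from Table 15.
classes = ["63-67", "68-72", "73-77", "78-82", "83-87", "88-92", "93-97"]
frequencies = [2, 5, 6, 10, 11, 5, 2]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Each accumulated frequency as a percentage of the total (41).
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

for label, f, c, p in zip(classes, frequencies, cumulative, cumulative_pct):
    print(f"{label}  {f:3d}  {c:3d}  {p:6.1f}")
```

The last accumulated frequency must equal the total of the table, which gives a ready check on the addition.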


In connection with the histogram, it should be noticed that,
if the class intervals are taken as one unit each, the area of the
figure is equal to the total frequency of the table. In Fig. 7,
for example,

    area = 2 × 1 + 5 × 1 + 6 × 1 + 10 × 1 + 11 × 1 + 5 × 1 + 2 × 1 = 41.

A second device for picturing a simple frequency distribution is
the frequency polygon, which is constructed by connecting the
mid-points of the class intervals of the histogram by straight
lines. It is shown in Fig. 8. If it is extended to the base line
at the mid-points of the intervals next beyond the end intervals,
and all equal intervals are taken as one unit in width, its total
area is equal to the total frequency of the table, but the
area over any one interval is usually not equal to the frequency
in that interval.



A histogram of the simple frequency distribution of Table 16,
which has unequal intervals, appears in Fig. 9.


Table 16. — Age Distribution for Mexicans in the United
States, 1930*

Age, Years          Number (Thou
Under 5                 21.48
 5-9                    20.55
10-14                   14.81
15-19                   13.72
20-24                   14.65
25-29                   13.53
30-34                   10.11
35-44                   16.30
45-54                    9.53
55-64                    4.60
65-74                    1.96
75 and over              0.88

   Total               142.12

* Adapted from Abstract of the Fifteenth Census of the United States, 1930, Bureau of the
Census.


Fig. 8. — Frequency polygon of Table 15.

¹ Notice that graphs, such as those of frequency distributions, which
involve two sets of measurements, are erected on the framework of two
graduated straight lines drawn at right angles. The horizontal line is
called the X axis, the perpendicular line the Y axis. Frequencies are
conventionally measured on the Y axis (but see Fig. 13), scale values on the X
axis (see Fig. 7).

If we let each interval of five years on the base line be one unit,
then, of course, an interval of 10 years will be two units, and the
height of the rectangle in a 10-year interval will be one-half of
the tabular frequency in that interval. The end interval, "75
and over," in Table 16 is of unspecified length, and so cannot
be accurately represented geometrically. It is accordingly
omitted from the graph, and its frequency removed from the
total. The sum of the areas of the remaining rectangles is then
equal to the corrected total frequency of the table. Moreover,
the area of each rectangle is equal to the frequency in the
corresponding interval.

If a polygon is drawn on Fig. 9 in the usual way, neither the
total area of the polygon nor the area in any interval will be
equal to the corresponding tabular frequency. The total area
can be made equal to the total frequency, however, if the polygon
is drawn to the mid-points of five-year intervals throughout,
using the same frequencies (heights) as in Fig. 9.


Fig. 9. Histogram of Table 16, with unequal intervals. (X axis: age, years.)


Notice that if the frequencies were known and graphed for
each year of age, instead of for each five- or 10-year age interval,
the rectangles of the histogram in Fig. 9 would become more
numerous and narrow. If then the frequencies in each year
were separated by months, we should have still more and
narrower rectangles. If this process of subdivision of intervals
were continued indefinitely, we should have a smooth curve
instead of a histogram or a polygon in Fig. 9. It is apparent that
if each minute interval were then regarded as being one unit in
width, the area under any part of the smooth curve would be
equal to the frequency over the same portion of the table (see
Fig. 10). A great deal of use is made of this fact in some of the
chapters that follow.

A polygon may be smoothed by passing through it a freehand
curve. This is a somewhat questionable way of judging how the
distribution would appear if the size of the sample were greatly
increased.



Fig. 10. Histogram of Fig. 9 reduced to a smooth curve.

When histograms or polygons are to be compared, they
should be graphed in terms of percentage rather than absolute
frequencies.

A very useful type of graph in the interpretation of a
frequency distribution is the cumulative curve, or ogive. The
accumulated frequencies for Table 15, forming a cumulative
frequency distribution, may be seen in the last column of that
table. Since each accumulated frequency merely shows the
total number of values that are less than the lower limit of the
class just above on the scale, a frequency distribution in
accumulative form is sometimes called a "less than" frequency
distribution. Plotting should be done carefully on coordinate
paper, in order that the resulting graph may be accurate enough
for computing purposes. The cumulative frequency curve for
Table 15 is shown in Fig. 11.

Notice in Fig. 11 that the frequency in each class interval is
plotted on the upper class limit, to show that a particular number
of students made a grade less than the one indicated by that
limit. Thus, 34 students in the given course made grades less
than 88, and 39 made grades below 93.

Not only does the cumulative frequency curve give a picture
of the distribution of frequencies that is different from that
shown by the histogram or polygon, but it may also be employed
for interpolation and computation. If, for example, we are
given Table 15 but know nothing else about the data, and wish
to change the class limits of the table, we can sometimes do this
most conveniently by means of the cumulative curve. Suppose
that we want the class limits of 65 to 69, 70 to 74, 75 to 79,
and so on. How many students made grades falling in each of
these new intervals? This can be decided approximately by
erecting perpendiculars at the points 65, 70, 75, and so on, on
the base scale, noting where they intersect the cumulative curve,
and drawing horizontal lines from these points of intersection
to the frequency scale at the left. Thus, the horizontals cut
the frequency scale at approximately the values 1, 5, 10, 18, 29,
37, 40. We can accordingly set up a new frequency table,
Table 17, whose last column is obtained by subtracting in the
second column the accumulative frequency in each class from
that in the class just above.
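The subtraction step can be sketched as follows. The accumulative readings are those quoted in the text, with the final value 41 taken from Table 17; the function name `class_frequencies` is hypothetical.

```python
# Accumulative frequencies read from the ogive at the new class limits
# (values from the text; the closing 41 is taken from Table 17).
cumulative = [1, 5, 10, 18, 29, 37, 40, 41]
new_classes = ["60-64", "65-69", "70-74", "75-79",
               "80-84", "85-89", "90-94", "95-99"]


def class_frequencies(cum):
    """Difference successive accumulative frequencies to recover the
    number of cases falling in each class."""
    return [cum[0]] + [later - earlier for earlier, later in zip(cum, cum[1:])]


students = class_frequencies(cumulative)
for label, count in zip(new_classes, students):
    print(label, count)
```

The differences reproduce the last column of Table 17, and their sum returns the total of 41 students.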

When it is desired simply to halve or combine the class intervals
of a simple frequency distribution, the work may be done by
direct division or addition more easily than by the use of a
cumulative curve. Thus, if in Table 15 the intervals are to be
halved, the frequencies in each interval are also halved
correspondingly. It is often desirable, however, to modify this
method somewhat by allowing for the shape of the curve. For
example, if the curve is rising in the interval, more of the
frequencies may be placed in the upper than in the lower subdivision
of the interval.

Table 17. Illustrating Change of Class Intervals of Table 15, by
Use of Cumulative Frequency Curve

Grades, per cent    Accumulative frequency    Students
60-64                        1                    1
65-69                        5                    4
70-74                       10                    5
75-79                       18                    8
80-84                       29                   11
85-89                       37                    8
90-94                       40                    3
95-99                       41                    1

Total                                            41

Fig. 12. Ogive in terms of percentage frequencies.

Percentage frequencies are often substituted for absolute
frequencies on the ogive. Figure 12 is the same as Fig. 11 except
for this change. From it we read on the Y axis that 50 per cent
of the students made a grade of less than 82 on the X axis,
approximately; 75 per cent made less than a grade of about 87;
and so on. The readings can be more accurate if finely ruled
coordinate paper is used.

Values may be accumulated on both scales, X and Y, and
expressed as percentages of their respective totals. This has
been done in Table 18 and Fig. 13. Each pair of accumulative
percentages determines a point, and they are called the
coordinates of the point. For example, the first two accumulative
percentages in the table furnish the coordinates (6.6, 0.2),
the one on the left (6.6) being an X value and the one on the
right (0.2) a Y value. The point is located on the chart by going
a distance of 6.6 percentage units from 0 along the X axis, and
then perpendicularly up a distance of 0.2 Y percentage units.


Table 18. Number of Farms by Size, Kansas, 1930*

                                                       Per cent        Accumulated per cent
Size of farm, acres   Number of farms  Total acreage   Farms   Acres    Farms    Acres
Under 20                  11,004            86,739       6.6     0.2      6.6      0.2
20-49                      9,264           312,710       5.6     0.7     12.2      0.9
50-99                     19,226         1,475,364      11.6     3.1     23.8      4.0
100-174                   42,920         6,319,557      25.8    13.5     49.6     17.5
175-259                   25,481         5,565,698      15.4    11.8     65.0     29.3
260-499                   38,385        13,796,240      23.1    29.4     88.1     58.7
500-999                   15,055        10,243,252       9.1    21.8     97.2     80.5
1,000-4,999                4,487         7,184,515       2.7    15.3     99.9     95.8
5,000 and over               220         1,991,572       0.1     4.2    100.0    100.0

Total                    166,042        46,975,647     100.0   100.0

* Adapted from Fifteenth Census of the United States, Bureau of the Census.


The resulting curve is called the Lorenz curve. From it we can
see that 50 per cent of the farms, i.e., the small farms (reading
from the left on the X axis), include 18 per cent of the total farm
acreage (reading from the bottom on the Y axis); that about
100 - 65 = 35 per cent of the farms, i.e., the large farms (reading
from the right on the X axis), include 100 - 30 = 70 per cent
of the total farm acreage (reading from the top on the Y axis);
and so on.
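The coordinates of the Lorenz curve can be computed from the counts of Table 18 along the following lines. This is only a sketch: it accumulates the exact counts rather than the rounded percentages, so an occasional coordinate may differ by 0.1 from the printed table, and the helper name `cumulative_percent` is ours.

```python
from itertools import accumulate

# Number of farms and total acreage by size class, from Table 18.
farms = [11004, 9264, 19226, 42920, 25481, 38385, 15055, 4487, 220]
acres = [86739, 312710, 1475364, 6319557, 5565698,
         13796240, 10243252, 7184515, 1991572]


def cumulative_percent(values):
    """Accumulate the values and express each running total as a
    percentage of the grand total."""
    total = sum(values)
    return [100 * running / total for running in accumulate(values)]


# Each pair (X, Y) is one point of the Lorenz curve.
lorenz_points = list(zip(cumulative_percent(farms), cumulative_percent(acres)))
for x, y in lorenz_points:
    print(f"{x:6.1f} {y:6.1f}")
```

Plotting these pairs, starting from the origin, and joining them gives the curve of Fig. 13.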

Further uses of the cumulative curve for computation will be 
shown later, under the topic of partition values (Chap. VIII). 

Fig. 13. Lorenz curve, for Table 18.

2. Graphs of Time Series. Statistical data often take the
form of a time series, rather than of a frequency distribution. A
time series is a set of values of a variable that correspond to
certain time intervals, such as years or months. For example,
the populations of a state in 1920 and again in 1930 are a very
brief time series (see Fig. 14).

In plotting the increase of one variable, e.g., the population
of a state, in terms of a second variable, e.g., years, it is often
of more interest to show the proportionate increase than the
absolute increase. For example, if a population of 3.0 millions
increases to 3.5 millions in 10 years, the increase is much less
impressive than when a population of 0.2 million increases to
0.7 million in the same period. Yet if the absolute increase is
plotted, this difference will not appear, as may be seen from Fig.
14, where the two growth lines are exactly parallel. To meet
this objection, the percentage increase may be plotted. The
growth from 3.0 to 3.5 millions is a percentage increase of 17,
that from 0.2 to 0.7 million is a percentage increase of 250. This
is shown in Fig. 15, where the line representing the growth
of the population of 0.2 million is much steeper than that
representing the growth of the population of 3.0 millions. In making
Fig. 15, the rate of growth in terms of the initial population is
required. Instead of going to the trouble of computing these
rates, much the same results may be accomplished by plotting
the absolute figures on a semilogarithmic scale. The latter
method is usually preferred to the former, because semilogarithmic
paper can be obtained at small cost, and the use of it saves much
labor.

Fig. 15. Population growth in terms of percentage increase. (X axis: year.)

Figure 16 shows the above population figures plotted directly
on semilogarithmic paper.

In Fig. 16, notice that the increase in population from 0.2 to
0.7 million is again represented by a much steeper line than is
the increase from 3.0 to 3.5 millions.

While the semilogarithmic scale does not show in strictly
accurate proportion one to another all percentage changes, it
represents equal percentage changes by equal slopes, and saves
much labor compared with percentage charts such as that shown
in Fig. 15.
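The two ways of showing proportionate growth can be compared numerically. The sketch below uses the two populations of the text; the function names are ours, and base-10 logarithms are assumed for the semilogarithmic scale.

```python
import math


def percentage_increase(start, end):
    """Increase expressed as a percentage of the initial value."""
    return 100 * (end - start) / start


def log_rise(start, end):
    """Vertical distance between two values on a logarithmic scale
    (base 10), i.e. the rise of the line on semilogarithmic paper."""
    return math.log10(end) - math.log10(start)


large = (3.0, 3.5)  # millions
small = (0.2, 0.7)  # millions

print(round(percentage_increase(*large)), round(percentage_increase(*small)))
print(round(log_rise(*large), 3), round(log_rise(*small), 3))
```

Both absolute increases are 0.5 million, yet the percentage increases and the rises on the logarithmic scale differ greatly. A doubling from 1 to 2 gives the same rise as a doubling from 50 to 100, which is the sense in which equal percentage changes have equal slopes.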

In using semilogarithmic paper, the repeated series of values
1 to 9 usually printed on the vertical scale may be multiplied by
any constant, provided the constant is applied to the whole scale.
Thus, in Fig. 16 the scale may be multiplied, say, by 7, by 0.5,
or by any other number, when thereby it will be made more
convenient for the plotting of particular data. A semilogarithmic
scale cannot contain a zero value.



In all graphic representation of data, the shape of the curve
or figure is affected by the ratio of the X and Y scales. Since
this ratio is usually a matter of arbitrary choice, advantage is
sometimes taken of the opportunity to produce certain desired
impressions. Figures 17 through 19 from Table 19 illustrate
only three of many possibilities.

Table 19. Increase in Enrollment of the Blank Military Academy,
1928-1938

Year    Enrollment        Year    Enrollment
1928       110            1934       118
1929       113            1935       120
1930       114            1936       122
1931       113            1937       127
1932       118            1938       130
1933       119


In Fig. 17, a rather moderate increase in enrollment is made
impressive by (1) using a large single-unit spacing on the Y scale,
(2) starting the increase from the base (bottom) line, and so
avoiding any comparison between the amount of increase and the
original volume of enrollment, (3) showing each year's increase as
a percentage of the enrollment in 1928, instead of as a percentage
of the enrollment of the preceding year. Figure 18 removes
criticism (2) above, and avoids criticism (3) by using absolute
enrollment figures instead of percentages of increase relative
to the total enrollment in 1928. Figure 18 is still open to
criticism (1) above, because the ratio of the X and Y units is
not changed. In fact, the X and Y units are different in nature,
so that it is impossible to say when one bears a just relation to
the other.

Fig. 17. Graph of data of Table 19.

Figure 19 meets criticism (3) by plotting the enrollment
figures on a semilogarithmic scale. The total enrollment is not
entirely pictured in the diagram because the semilogarithmic
scale begins at 1 instead of at 0, but this is a minor matter.
Evidently, the growth of the school makes a much poorer showing
in Fig. 19 than in either Fig. 17 or Fig. 18. Probably Fig. 19
gives the most realistic picture of the facts in this particular case.



Fig. 18. Absolute increase in enrollment of the Blank Military Academy,
1928-1938. (X axis: year.)


3. Miscellaneous Graphs. A common device for the graphic
comparison of amounts or percentages is the bar chart, either
upright or horizontal. The histogram of Fig. 7 above can be
regarded as essentially an upright bar chart. Figure 20 shows a
horizontal bar chart applied to Table 20.


Table 20. Percentage of Females 15-44 Years of Age Married,
Selected European Countries*

Country                 Percentage of Females Married
Bulgaria                           67.0
England and Wales                  48.5
France                             57.1
Germany                            48.4
Italy                              48.4
Sweden                             42.3

* Adapted from W. S. Thompson, Population Problems, 2d ed., p. 104,
McGraw-Hill Book Company, New York, 1935.

Fig. 19. Rate of increase in enrollment of the Blank Military Academy,
1928-1938. (X axis: year, 1928 to 1938.)

Fig. 20. Percentages of females 15-44 years of age married in selected
European countries. (From W. S. Thompson, op. cit., p. 104.) (Horizontal
bars for Bulgaria, England and Wales, France, Germany, Italy, and Sweden;
X axis: percentage of females married, 0 to 70.)





Two variations of the bar chart are seen in Figs. 21 and 22.
Instead of the bar chart, comparisons are often made in terms
of the areas of squares or circles, or of the volumes of cubes or
spheres, as in Figs. 23 and 24. These devices, however, force the
eye to perform the rather difficult feat of measuring two or even
three dimensions simultaneously.

Fig. 21. Percentage of the population of the United States represented by
each race (White, Negro, Other), 1930. (Adapted from R. Clyde White, Social
Statistics, p. 178, Harper & Brothers, New York, 1933.)

Fig. 22. Percentage of white and Negro races among the commitments
to prisons and reformatories, 1910 and 1923. (From R. Clyde White, op. cit.,
p. 179.)

Fig. 23. Ratio of native born to foreign born in City X, 1930.

Fig. 24. Ratio of native born to foreign born in City X, 1930.

Fig. 25. World distribution of telephones. (Adapted from G. R. Davies
and Dale Yoder, Business Statistics, p. 40, John Wiley & Sons, Inc., New York,
1937.)

The so-called "pie chart," pictured in Fig. 25, is convenient
for showing how a whole is subdivided.

Fig. 26. Estimated unemployment, United States, March, 1929, and March,
1931. (Adapted from On Relief, Federal Emergency Relief Administration,
Chart IX.)

Fig. 27. Percentage of unemployment relief expenditure paid locally, state of
Wisconsin, year ending Sept. 1, 1934.

More realistic and striking than any of the preceding devices
are pictograms, of which Fig. 26 is an example.

Maps are treated in many ingenious ways for statistical
purposes. Crosshatching (see Fig. 27), the insertion of
pictograms, and spotting are common devices.

In any attempt to present statistical figures in graphic form, 
the following two principles are to be kept in mind. (1) The 
graph should be more quickly and easily comprehended than the 
same data in tabular or nongraphic form. Graphs are sometimes 
so complex or ingenious that they can be deciphered only with 
the aid of the textual and tabular material that they are intended 
to clarify. (2) The graph should not misrepresent or exaggerate 
the facts. 


Exercises 

1. The following figures taken from the Fifteenth Census of the United
States represent the growth in the population of Milwaukee:

Year    Population        Year    Population
1930     578,249          1880     115,587
1920     457,147          1870      71,440
1910     373,857          1860      45,246
1900     285,315          1850      20,061
1890     204,468          1840       1,712

Show these data graphically.

2. Suppose that you grade the behavior of a group of juvenile
delinquents in a reform school and want to post a weekly chart showing the
standing of each delinquent. Describe briefly the kind of chart you
would use.

3. The charts on page 92 show how the numbers of the insane,
epileptic, and feeble-minded persons in state institutions and the prison
population in a certain state have increased in the last 25 years.

Have you any criticism of these charts?

4. Plot the distribution shown below as a frequency polygon. Show
that the area under the polygon is equal to the total frequency, but that
the area in some intervals is not equal to the frequency in those intervals.

Distribution of 106 Employees by Age Class

Age of Employee, Years    Employees
15-24                        14
25-34                        49
35-44                        23
45-54                        13
55-64                         6
65-74                         1

5. Rearrange the frequencies of Problem 4 in class intervals of 15-18,
19-22, 23-26, and so on.

6. Plot the data of Problem 4 as an ogive, and read off the age below
which 75 per cent of the employees fall.

7. Devise a problem for which a Lorenz curve is suitable, graph the
curve from your data, and show its use.
8. Plot the following data in such a way that (a) it gives an unbiased
picture of the rate of change, (b) it exaggerates the impression of the
rate of change.

Population of Madison, Wis., 1890-1940*

Year    Population
1890      13,426
1900      19,164
1910      25,531
1920      38,378
1930      57,899
1940      66,802

* From the Fifteenth Census of the United States, Bureau of the Census.

CHAPTER VII

AVERAGES AND RATES


1. The Need for an Average. An investigator is interested, let
us say, in the height of the residents of a certain Swiss
community in the United States, on the theory that they are taller
than their relatives in the old country. Unable to measure
the whole community of over 4,000 persons, he takes a random
sample of, say, 182 adult males, and gets their heights as
accurately as possible. He then finds himself with 182 individual
measurements. What will he do with them? He may perhaps
first arrange them in order of magnitude, to form an array.
If no two of the measurements happen to be identical, he will
still have 182 different measurements. In any case, it will be
impossible for him to hold in mind all the separate values, and
he will feel the need of some one figure by which to represent
them. This need will be still greater when he attempts to
determine whether or not the American group is taller than a
similar group in the Old World, because some of the former will
be taller than some of the latter, and vice versa. In his search
for a single figure by which to represent the many, he will
certainly arrive at the idea of calculating an average.

2. The Mode. The simplest form of average is the mode (Mo),
which is merely the value in a series that occurs most often.
If the heights are all different, there can be no mode in ungrouped
data. If some persons are of the same height, however, a mode
may occur in our array. We then choose as the mode the height
that occurs the greatest number of times. For example, in the
following array of the heights of 10 European Swiss males, 4 ft.
11 in., 5 ft. 3 in., 5 ft. 7 in., 5 ft. 8 in., 5 ft. 9 in., 5 ft. 9 in., 5 ft.
10 in., 5 ft. 11 in., 6 ft. 0 in., the mode is 5 ft. 9 in.; but of course
the sample is too small to give much information about the
modal height of European Swiss males in general. Whether or
not a mode is convincing depends on how conspicuously the
modal height stands out above the others in frequency of
occurrence. If the height 5 ft. 7 in. occurs 10 times and the height 5 ft.
7.3 in. occurs nine times, it is not certain that one is significantly
more frequent than the other.
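Finding the mode of ungrouped data amounts to counting repetitions. The sketch below uses the nine heights actually printed in the array above, converted to inches; the function name `modes` is ours, and it returns an empty list when no value repeats.

```python
from collections import Counter

# The heights printed in the text, converted to inches
# (4 ft. 11 in. = 59, 5 ft. 3 in. = 63, and so on).
heights_in = [59, 63, 67, 68, 69, 69, 70, 71, 72]


def modes(values):
    """Return the value(s) occurring most often, or an empty list
    when every value occurs only once (no mode)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes(heights_in))
```

Here the only repeated height is 69 inches, i.e., 5 ft. 9 in.; an array of all-different heights yields no mode.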

The situation becomes clearer if we decide to overlook slight
differences in height, and combine our measurements in
carefully chosen class intervals, as in Table 21. If the distribution
is rather regular, we may by inspection then determine whether
or not any one class interval has a sufficiently larger frequency
than any other to be confidently regarded as the modal class. If
so, that is usually all we need to know. In Table 21, col. (1),
the modal interval is evidently 60 to 64 inches. In col. (2),
the distribution has two modes, i.e., it is bimodal, suggesting
that it may contain both males and females. In such cases, it
often helps to plot the data in the form of a frequency polygon
or histogram, e.g., Fig. 28.

Fig. 28. Histogram of bimodal frequency distribution of Table 21,
col. (2). (X axis: height in inches.)

Table 21. Heights of 165 and 182 American Males of Swiss Descent

Height, inches    Males (1)    Males (2)
45-49                  2            2
50-54                 10           10
55-59                 21           55
60-64                 55           21
65-69                 40           57
70-74                 32           32
75-79                  5            5

Total                165          182

Determination of the exact modal value in grouped data is
complex, and cannot be treated here. Several rough methods
of interpolating within the modal class are available, such as that
of formula (1):¹

\[ Mo = L + \frac{\Delta_1}{\Delta_1 + \Delta_2}\, i, \qquad (1) \]

where L is the lower limit of the modal class, \Delta_1 is the difference
(disregarding signs) between the frequency of the modal class
and the frequency of the class just below the modal class on the
scale, \Delta_2 is the difference (disregarding signs) between the
frequency of the modal class and the frequency of the class just
above the modal class, and i is the size of the modal class interval.
Applying this formula to the distribution of Table 21, col. (1),
we find the crude mode,

\[ Mo = 60 + \frac{(55 - 21)(5)}{(55 - 21) + (55 - 40)}, \]

\[ Mo = 63.5. \]

¹ \Delta is the capital Greek letter Delta.
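Formula (1) can be written out as a small function. This is a sketch only; the function and argument names are ours, and the figures are those of Table 21, col. (1).

```python
def crude_mode(lower_limit, f_modal, f_below, f_above, interval):
    """Interpolate within the modal class by formula (1):
    Mo = L + delta1 / (delta1 + delta2) * i."""
    delta1 = abs(f_modal - f_below)  # difference from the class just below
    delta2 = abs(f_modal - f_above)  # difference from the class just above
    return lower_limit + delta1 / (delta1 + delta2) * interval


# Modal class 60-64 (f = 55), flanked by 55-59 (f = 21) and 65-69 (f = 40).
mo = crude_mode(lower_limit=60, f_modal=55, f_below=21, f_above=40, interval=5)
print(round(mo, 1))
```

The larger the drop to the class below relative to the drop to the class above, the farther the crude mode is pushed toward the upper limit of the modal class.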


Another approximate method of finding the mode of a
frequency distribution is provided by formula (2):¹

\[ Mo = M - 3(M - Md), \qquad (2) \]

where M is the arithmetic mean of the distribution, and Md is the
median, as described below. Assuming that for the distribution
of Table 21, col. (1), M = 64.68 and Md = 64.5, formula (2)
gives for the crude mode:

\[ Mo = 64.68 - 3(64.68 - 64.5), \]

\[ Mo = 64.1. \]

This value is a little different from that found by formula (1).
Mention of the conditions under which formula (2) is appropriate
is made in Sec. 5, below.
3. The Median. Suppose that we have the following array
of the heights of 11 American adult males of Swiss descent:

Table 22. Heights of 11 American Adult Males of Swiss Descent

Male    Height, inches
 1          68
 2          68.5
 3          69
 4          69.5
 5          70
 6          71
 7          71.5
 8          72
 9          72.5
10          73
11          74

¹ See derivation of formula (54), Chap. IX.

A quick way of getting some idea of an average for this series
would be to note the height that stands at the middle of the
series. This is seen to be 71 inches, or height number 6 in rank
order. This kind of average is called the median, which is
defined as the middle value, or that value which is exceeded by
as many values as it exceeds.

Now if a twelfth person of, say, height 75 inches is added to the
above group, a difficulty arises. There is no middle value.
Unless we are willing to take the mean height of the sixth and
seventh persons, \( \frac{71 + 71.5}{2} = 71.25 \), as the median, we must
say that there is none. Although the median so found, 71.25
inches, is a height that does not actually appear in the series,
it is customary for most purposes to accept it as the median.

Consider another common case. Let the height of the fifth
person in the first group of 11 persons be 71 inches. Again,
strictly speaking, there can be no median that meets the
definition, because there are no longer as many heights below the
middle height as there are above it. As before, a compromise
is commonly made by taking the middle value (71 inches) as the
median.

From the above, it will be noticed that the formula for locating
the median value in an ungrouped series is to add one to the
number of values and divide by two:

\[ \frac{N + 1}{2}. \qquad (3) \]

Thus, above, where N = 11, the position of the median value is
(11 + 1)/2 = 12/2 = 6, or the value in position 6, and where N = 12,
(12 + 1)/2 = 13/2 = 6.5, the median value is the height in position
6.5, which can only be the mean of the heights in positions 6 and 7.
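The rule of formula (3) can be sketched as follows, using the heights of Table 22. The function names are ours; a fractional position such as 6.5 is handled by averaging the two neighboring values.

```python
import math

# Heights of Table 22, in inches.
heights = [68, 68.5, 69, 69.5, 70, 71, 71.5, 72, 72.5, 73, 74]


def median_position(n):
    """Position of the median in an array of n values, formula (3)."""
    return (n + 1) / 2


def median(values):
    """Value at the median position; for a position such as 6.5,
    the mean of the values in the two adjoining positions."""
    ordered = sorted(values)
    position = median_position(len(ordered))
    lower = ordered[math.floor(position) - 1]  # positions count from 1
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2


print(median_position(11), median(heights))
print(median_position(12), median(heights + [75]))
```

With 11 values the median is the sixth height; with the twelfth person added, it is the mean of the sixth and seventh heights.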

The above relates to ungrouped data. When the items of a
series are grouped in class intervals, the median is regarded
as the value on the X scale that divides the area of the frequency
histogram or curve into two equal parts, as shown in Fig. 28.
Thus, in Table 21, col. (2), N/2 = 182/2 = 91. Now, 88
frequencies fall below the class limit, 65 inches, so that 91 - 88 = 3
frequencies fall inside the interval 65 to 69. Since there are 57
frequencies in this interval, and the width of the interval is
5 inches, the median falls (3/57) × 5 = 0.263 inch inside the interval,
or at the point 65 + 0.263 = 65.263 inches on the X scale. The
area below the median is then 2 + 10 + 55 + 21 + (0.263/5)(57) = 91,
that above the median is (4.737/5)(57) + 32 + 5 = 91,
and the two areas are equal.

The simplest way to find the median of grouped data is as
follows. Accumulate the frequencies, as in the last column of
Table 23. Divide N by 2: 165/2 = 82.5. Look down the
column of accumulated frequencies until the frequency in the
position 82.5 is found, in the interval 60-64. From 82.5
subtract the accumulated number of frequencies below the median
interval: 82.5 - 33 = 49.5. Multiply the width of the class
interval by the fraction 49.5/55, formed by the difference just
found as numerator and the frequency of the median interval
as the denominator: 5 × 49.5/55 = 4.5. Add this product to
the lower limit of the median interval: 60 + 4.5 = 64.5. This
is the median height for the table.

Table 23. Height of 165 American Adult Males of Swiss Descent

                          Males
Height, inches    Number    Accumulated number
45-49                2              2
50-54               10             12
55-59               21             33
60-64               55             88
65-69               40            128
70-74               32            160
75-79                5            165

Total              165

We can express the above steps by means of a formula, which
is applicable to frequency distributions:

\[ Md = L + \left( \frac{\frac{N}{2} - F}{f} \right) i, \qquad (4) \]

where L is the lower limit of the class interval in which the
median falls, F is the number of accumulated frequencies that fall
below (i.e., in class intervals with limits smaller than those of) the
median class interval, f is the number of frequencies in the
median class interval, i is the size of the median class interval,
and N is the total frequency of the table. N/2 is first found, and
then the remaining symbols can be evaluated and substituted
in the formula, as indicated in the preceding paragraph. Thus,
for the problem above,

\[ Md = 60 + \left( \frac{\frac{165}{2} - 33}{55} \right) 5 = 64.5, \]

as before.
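Formula (4) lends itself to a short routine. The sketch below assumes equal class intervals and follows the text in taking 60, 65, etc., as the lower class limits; the function name `grouped_median` is ours.

```python
def grouped_median(lower_limits, freqs, interval):
    """Median of a frequency distribution by formula (4):
    Md = L + ((N/2 - F) / f) * i, assuming equal class intervals."""
    half = sum(freqs) / 2
    below = 0  # F: accumulated frequency below the current class
    for lower, f in zip(lower_limits, freqs):
        if below + f >= half:
            return lower + (half - below) / f * interval
        below += f
    raise ValueError("empty frequency distribution")


limits = [45, 50, 55, 60, 65, 70, 75]
col_1 = [2, 10, 21, 55, 40, 32, 5]   # Table 23 (Table 21, col. 1)
col_2 = [2, 10, 55, 21, 57, 32, 5]   # Table 21, col. 2

print(round(grouped_median(limits, col_1, 5), 3))
print(round(grouped_median(limits, col_2, 5), 3))
```

The two results agree with the values 64.5 and 65.263 found step by step above.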

4. The Arithmetic Mean. The arithmetic mean, M, is the
type of average that is most often used. It is the sum of the
X values divided by their number, N:

\[ M = \frac{\Sigma X}{N}. \qquad (5)* \]

For example, in the case of the ungrouped values, 3, 7, 2, 12,
1, 16, 4, representing the numbers of children in seven Italian
immigrant families, their sum is 45, and there are seven of them,
so that M = 45/7 = 6.43.

If some of the above values had occurred more than once, we
might have

 X      X (Continued)
 1          4
 2          7
 2          7
 3          7
 3         12
 3         12
 4         16
 4         16
 4        ---
 4        111 (sum)

\[ M = \frac{111}{18} = 6.17. \]

But, as shown in Chap. V, this long array may be condensed:

* The Greek letter \Sigma, capital Sigma, means to sum, or add, the X values.

Table 24. Number of Children in 18 Italian Immigrant Families

Children (X)    Families (f)*    fX†
     1               1             1
     2               2             4
     3               3             9
     4               5            20
     7               3            21
    12               2            24
    16               2            32

Total               18           111

* Frequency.

† Frequency multiplied by X.

In the case of grouped data, it is more convenient to write
formula (5) in the form

\[ M = \frac{\Sigma fX}{N}, \qquad (6) \]

where f is the frequency.

Substituting in formula (6) N = 18 and the total of the third
column of the array just above,

\[ M = \frac{111}{18} = 6.17, \]

as before.

Formula (6) may be applied to any frequency distribution, e.g.,
that of Table 25.
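Formula (6) can be sketched in a few lines. The function name `frequency_mean` is ours; the first set of figures is that of Table 24, and the second uses the mid-points and frequencies of Table 25.

```python
def frequency_mean(values, freqs):
    """Arithmetic mean of a frequency distribution, formula (6):
    M = sum(f * X) / N, where N = sum(f)."""
    total = sum(f * x for x, f in zip(values, freqs))
    return total / sum(freqs)


# Table 24: number of children (X) and number of families (f).
children = [1, 2, 3, 4, 7, 12, 16]
families = [1, 2, 3, 5, 3, 2, 2]

# Table 25: class mid-points (X) and number of adult males (f).
midpoints = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5]
males = [2, 10, 21, 55, 40, 32, 5]

print(round(frequency_mean(children, families), 2))
print(round(frequency_mean(midpoints, males), 2))
```

For grouped data the mid-point stands in for every value in its class, so the result is only as accurate as that substitution.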


Table 25. Height of 165 American Adult Males of Swiss Descent

Height                   Adult males
Inches      X*         f          fX
45-49      47.5         2         95.0
50-54      52.5        10        525.0
55-59      57.5        21      1,207.5
60-64      62.5        55      3,437.5
65-69      67.5        40      2,700.0
70-74      72.5        32      2,320.0
75-79      77.5         5        387.5

Total                 165     10,672.5

\[ M = \frac{10,672.5}{165} = 64.68. \]

* Mid-points.

As pointed out earlier, the mean calculated from a frequency
table in which the mid-points are not identical with the means
within the intervals is, of course, somewhat inaccurate, as is
any other average or statistic found from such a table.

It is possible to simplify the calculations needed to find the
arithmetic mean (usually called simply the mean) in a frequency
distribution such as that of Table 25. Suppose that the
mid-points of any distribution are X_1, X_2, X_3, etc. They can be
represented by the diagram of Fig. 29, where f_1 is the frequency
in the X_1 interval, etc., measured along the Y axis.

Fig. 29. Diagram used in derivation of formula (13). (Origin O; points A,
X_1, X_2, etc., marked on the X axis.)

By formula (6),

\[ M = \frac{\Sigma fX}{N}. \]

But suppose that we choose to measure the X values from some
arbitrarily assumed or "guessed" point on the X axis, say A,
in Fig. 29. Then

\[ X_1 = A + X_1', \quad X_2 = A + X_2', \ \text{etc.,} \qquad (7) \]

where the X' values represent the distances of the X values
measured from A.

If we further choose to reduce the size of the X' values by dividing
them by the size of the class interval, i, or other constant, we have

\[ \frac{X_1'}{i} = d_1, \quad \frac{X_2'}{i} = d_2, \ \text{etc.,} \qquad (8) \]

or

\[ X_1' = d_1 i, \quad X_2' = d_2 i, \ \text{etc.} \qquad (9) \]

Substituting the values of X_1' and X_2' from (9) in (7),

\[ X_1 = A + d_1 i, \quad X_2 = A + d_2 i, \ \text{etc.} \qquad (10) \]

Substituting from (10) in (6),

\[ M = \frac{\Sigma f(A + di)}{N} = \frac{\Sigma fA}{N} + \frac{\Sigma fdi}{N}. \qquad (11) \]

Constants can always be placed outside the summation sign,¹ so
that

\[ M = \frac{A\,\Sigma f}{N} + \frac{i\,\Sigma fd}{N}. \]

Now

\[ \Sigma f = N. \]

So that

\[ M = \frac{A N}{N} + \frac{i\,\Sigma fd}{N}, \]

or

\[ M = A + \frac{i\,\Sigma fd}{N}. \qquad (13) \]

Let us apply formula (13) to find the mean of the frequency
distribution of Table 26:

Table 26. Height of 165 American Adult Males of Swiss Descent

 X*      X' = X - A      d        f       fd
47.5        -15         -3        2      - 6
52.5        -10         -2       10      -20
57.5        - 5         -1       21      -21
62.5          0          0       55        0
67.5        + 5         +1       40      +40
72.5        +10         +2       32      +64
77.5        +15         +3        5      +15

Total                           165      +72

* Mid-points.


In the above table, by arbitrary choice, the assumed mean is

A = 62.5,
i = 5,
\Sigma fd = 72,
N = 165.

Therefore, substituting in formula (13),

\[ M = 62.5 + \frac{5(72)}{165}, \]
\[ M = 62.5 + 5(0.436), \]
\[ M = 62.5 + 2.18, \]
\[ M = 64.68, \]

which is the same as we found the mean to be by the "long"
method. The calculations required in Table 26 are greatly
reduced compared with those in Table 25.

¹ Notice the principle that

\[ \Sigma(X + Y) = \Sigma X + \Sigma Y. \qquad (12) \]
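The short method of formula (13) can be sketched as below, with the mid-points and frequencies of Table 26. The function name `short_method_mean` is ours; the second call uses a different assumed mean to show that the choice of A does not affect the result.

```python
def short_method_mean(midpoints, freqs, assumed_mean, divisor):
    """Mean by formula (13): M = A + i * sum(f * d) / N,
    where d = (X - A) / i for each class mid-point X."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / divisor
                 for x, f in zip(midpoints, freqs))
    return assumed_mean + divisor * sum_fd / n


midpoints = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5]
males = [2, 10, 21, 55, 40, 32, 5]

m_a = short_method_mean(midpoints, males, assumed_mean=62.5, divisor=5)
m_b = short_method_mean(midpoints, males, assumed_mean=67.5, divisor=5)
print(round(m_a, 2), round(m_b, 2))
```

Computing the mean from two different assumed means in this way is a convenient check on the arithmetic.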

The second column of Table 26 is inserted for explanatory
purposes only, and is omitted except when irregular class intervals
cause difficulties. Table 27 illustrates the usual form for
computation.

Table 27. Number of Relief Cases per Block in a Slum Area of a City

     X        d       f       fd
    1.5      -3       4      -12
    3.5      -2      10      -20
    5.5      -1      14      -14
    7.5       0      26        0
    9.5      +1      19      +19
   11.5      +2      14      +28
   13.5      +3       8      +24
   15.5      +4       4      +16
   17.5      +5       1      + 5
   Total            100      +46

\[ M = 7.5 + 2\left(\frac{46}{100}\right) = 8.42. \]

Notice that it makes no difference in the result where the
d = 0, i.e., the assumed mean, is placed. A good way to check
the work is to perform the calculations from two different
assumed means.
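The check just recommended can be sketched as follows. The sketch is illustrative only: the helper `short_method_mean` is a name assumed here, the data are those of Table 26, and the second assumed mean (67.5) is chosen arbitrarily.

```python
def short_method_mean(midpoints, freqs, assumed_mean, interval):
    """Formula (13): M = A + i * sum(f*d) / N, with d = (X - A) / i."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / interval
                 for x, f in zip(midpoints, freqs))
    return assumed_mean + interval * sum_fd / n


# Table 26: mid-points and frequencies.
X = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5]
F = [2, 10, 21, 55, 40, 32, 5]

# The same table worked from two different assumed means.
m_first = short_method_mean(X, F, assumed_mean=62.5, interval=5)
m_second = short_method_mean(X, F, assumed_mean=67.5, interval=5)

if __name__ == "__main__":
    print(round(m_first, 2), round(m_second, 2))
```

If the two results disagree, an error has been made in one of the fd columns.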

Formula (13) also holds for irregular class intervals if $i$,
which may be any convenient divisor as well as the size of the
class interval, is held constant. This may be illustrated below.

Consider this table of age distribution for Wisconsin, from the
1930 census.

Table 28. Age Distribution of Population, Wisconsin, 1930

  Age            Per cent    X*      X - A    d = (X - A)/5     fd     Accumulated
                   (f)                                                     f†
  Under 5           9.2      2.5     -20.0       -4.0         -36.8       9.2
  5-9               9.9      7.5     -15.0       -3.0         -29.7      19.1
  10-14             9.7     12.5     -10.0       -2.0         -19.4      28.8
  15-19             9.2     17.5     - 5.0       -1.0         - 9.2      38.0
  20-24             8.3     22.5       0           0             0       46.3
  25-29             7.7     27.5     + 5.0       +1.0         + 7.7      54.0
  30-34             7.4     32.5     +10.0       +2.0         +14.8      61.4
  35-44            14.0     40.0     +17.5       +3.5         +49.0      75.4
  45-54            10.6     50.0     +27.5       +5.5         +58.3      86.0
  55-64             7.2     60.0     +37.5       +7.5         +54.0      93.2
  65-74             4.6     70.0     +47.5       +9.5         +43.7      97.8
  75 and over       2.0       ?        ?           ?             ?       99.8
  Total            99.8                                       132.4

* Mid-points. The census records age in whole years, as of the last birthday. But since
the actual ages are not discrete, age should be treated as continuous. Otherwise, all
averages will be too low.

† Accumulated frequencies.

Finding the mean for the table below age 75¹ by substituting
in formula (13),

\[ M = 22.5 + 5\left(\frac{132.4}{97.8}\right) = 22.5 + 5(1.354), \]
\[ M = 29.27. \]
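The computation for irregular class intervals may be sketched in Python as below. The variable names are assumptions made for this illustration; the figures are the percentages and mid-points of Table 28 below age 75. The second call uses a different constant divisor (2.5, chosen arbitrarily) to show that any divisor gives the same mean so long as it is held constant.

```python
# Table 28, classes below age 75: mid-points (X) and per cent frequencies (f).
MIDPOINTS = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 40.0, 50.0, 60.0, 70.0]
PER_CENT = [9.2, 9.9, 9.7, 9.2, 8.3, 7.7, 7.4, 14.0, 10.6, 7.2, 4.6]


def short_method_mean(midpoints, freqs, assumed_mean, divisor):
    """Formula (13) with a constant divisor i; the intervals may be irregular."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / divisor
                 for x, f in zip(midpoints, freqs))
    return assumed_mean + divisor * sum_fd / n


mean_with_5 = short_method_mean(MIDPOINTS, PER_CENT, assumed_mean=22.5, divisor=5)
mean_with_2_5 = short_method_mean(MIDPOINTS, PER_CENT, assumed_mean=22.5, divisor=2.5)

if __name__ == "__main__":
    print(round(mean_with_5, 2), round(mean_with_2_5, 2))
```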


In such a table as this, with an open interval, only the median
and mode can be found for the total table. Why? What is
the median value for Table 28?

¹ The mean, of course, cannot be found for the table including the open
interval, "75 and over," because no mid-point can be assigned to an open
interval.

5. Interpretation of the Common Averages. The arithmetic
mean, M, is the most familiar type of average. It is amenable
to algebraic operations which cannot be applied to the median
or mode. Suppose we know that the mean of one distribution of
50 items is 4, and the mean of a second comparable distribution
of 75 items is 6. Then the mean of both distributions is

\[ \frac{(4 \times 50) + (6 \times 75)}{50 + 75} = 5.2. \]

The only accurate way of finding the median of the total
distribution is actually to combine the distributions, interval
by interval, and recompute the median and mode for the combined
distribution, just as was done for the separate distributions.
If there are several medians given, it is possible to find the
median of the medians, but it is not likely to be the same as
the median of the combined distributions. Although the mean of
two or more medians is sometimes used, the meaning of such a
combination of averages is not clear. A correct total cannot be
obtained by multiplying the median by the number of items in a
distribution.¹
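The contrast between means and medians in this respect can be sketched in Python. The function `combined_mean` and the two short series `series_a` and `series_b` are hypothetical, invented here purely for illustration; only the figures 4, 6, 50, and 75 come from the text.

```python
from statistics import median


def combined_mean(means, counts):
    """Mean of several distributions pooled, from their means and sizes."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


# The example in the text: 50 items with mean 4, and 75 items with mean 6.
pooled_mean = combined_mean([4, 6], [50, 75])

# Two hypothetical series, used only to show that medians do not combine.
series_a = [1, 2, 3, 10, 11]
series_b = [4, 5, 6, 7, 8, 9, 30]

mean_of_medians = (median(series_a) + median(series_b)) / 2
median_of_combined = median(series_a + series_b)

if __name__ == "__main__":
    print(pooled_mean)
    print(mean_of_medians, median_of_combined)
```

For these two series the mean of the medians is 5, while the median of the combined series is 6.5; the pooled mean, by contrast, is recovered exactly from the separate means and their numbers of items.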

A second characteristic of the mean is that it alone of the
three averages reflects the exact value of every item. If extreme
values occur in a series, they affect the mean much more than
the median or the mode, because the median is affected only
by the circumstance that an item is greater or smaller than the
median item (the amount of the difference being of no
consequence) and the mode is affected only by whether or not the
size of a value throws it into one class interval or another.
Consider the series of ages in years, 2, 4, 7, 10, 13, 15, 19. M = 10,
Md = 10. If the three items that are larger than 10 are replaced by
three others also larger than 10, the Md stays the same, but the
M changes. Thus for 2, 4, 7, 10, 58, 70, 80, M = 33, Md = 10.
This is sometimes an advantage of the mean, and sometimes a
disadvantage. If the extreme values are regarded as atypical
of the series, the median will be a better average than the mean,
because the median is less influenced by such values. If, on
the other hand, the extreme values are thought to be an integral
part of the series and to deserve full weight, then the mean is
more appropriate than the median. In series where the mean
seems inappropriate, it is often advisable to question the
representativeness of any average, and to drop the atypical items.
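The two series of ages just discussed can be checked with Python's standard `statistics` module. The variable names are assumptions made for this sketch.

```python
from statistics import mean, median

original_ages = [2, 4, 7, 10, 13, 15, 19]
# The three items above 10 replaced by three larger items.
altered_ages = [2, 4, 7, 10, 58, 70, 80]

original_mean, original_median = mean(original_ages), median(original_ages)
altered_mean, altered_median = mean(altered_ages), median(altered_ages)

if __name__ == "__main__":
    print(original_mean, original_median)  # mean and median of the first series
    print(altered_mean, altered_median)    # the mean moves; the median does not
```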

¹ Willford I. King, The Elements of Statistical Method, p. 131, The
Macmillan Company, New York, 1918.

A third important trait of the mean is that it usually changes
less than the other two averages, from sample to sample.
Suppose that the I.Q.'s of the first 100 students met on a college
campus are taken and the M, Md, and Mo of these I.Q.'s are
computed. The same thing is done with a second hundred
students, a third, and so on. Then the differences between the means
of the several samples will generally be less than the differences
between the medians or the modes. This sampling stability of
the mean is very much in its favor.
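The sampling stability of the mean can be illustrated by a small simulation. Everything in this sketch is an assumption made for illustration: the I.Q.'s are drawn from a normal distribution with mean 100 and standard deviation 15, the function name `sampling_spread` is invented here, and the mode is left out because it is not well defined for ungrouped continuous values.

```python
import random
from statistics import median, pstdev


def sampling_spread(num_samples=500, sample_size=100, seed=0):
    """Return the spread (standard deviation) of sample means and of
    sample medians over repeated samples of hypothetical I.Q. scores."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(num_samples):
        sample = [rng.gauss(100, 15) for _ in range(sample_size)]
        means.append(sum(sample) / len(sample))
        medians.append(median(sample))
    return pstdev(means), pstdev(medians)


if __name__ == "__main__":
    spread_of_means, spread_of_medians = sampling_spread()
    print(spread_of_means, spread_of_medians)
```

Under these assumptions the means vary less from sample to sample than the medians do. The result depends on the form of the distribution assumed, which is why the text says "usually."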
For such reasons as the above, in the averaging of measurements
the arithmetic mean is always to be preferred to the
median or mode unless it is felt to be much less representative
of the series than they are, or unless, because of open-end class
intervals, the mean cannot be calculated. When we have to deal
with a series of ranked items, rather than measured values,
however, only the median applies.

A frequency distribution is exactly balanced along the
perpendicular erected at the mean. The sum of the deviations of a
series of values from their mean with regard for signs, i.e., the
algebraic sum, is always zero. This is not true of the other
averages except in perfectly symmetrical distributions, where
the mean, median, and mode all coincide (see Fig. 30). A
distribution is symmetrical when equal frequencies occur at equal
distances above and below the mean, as in Table 29 and Fig. 30.

Fig. 30. Graph of symmetrical distribution of Table 29 (horizontal
axis: number of heads).

On the other hand, when signs are disregarded, the sum of the
deviations is least in the case of the median.

Typically, in distributions that are not symmetrical or
bell-shaped, but skewed, i.e., extending farther on one side than on
the other, the mean is pulled farthest in the direction of the
skewness (because of its sensitiveness to extreme values), the
mode is nearest the end of the scale opposite the direction of
the skewness, and the median falls somewhere in between the
other two (see Fig. 31). Indeed, in moderately skewed
distributions, the median is generally about one-third of the
distance from the mean to the mode, a fact utilized in formula (2)
above. If the three averages are calculated for the skew
distribution of Table 28 below age 75, using formula (2) for the mode,
they will be found to fall in this way (M = 29.27; Md = 27.34;
Mo = 23.48).


Fig. 31 — Skewed frequency distributions. 


The usefulness of any average usually depends upon how
representative it is of its distribution or series, i.e., upon what
proportion of the items in the series is close to the average.
Although it is mathematically possible to calculate the mean,
median, or mode for any series, the concept of the average as a
value representative of the series has much more validity in the
case of some series than of others. It is most valid for
symmetrical distributions, and least valid for distributions shaped like
the letter J (or reversed J), or the letter U, illustrated in Table
30, cols. (2) and (3), respectively, and Figs. 33, 34. In the case of
J and U shaped distributions, any average is likely to conceal more
important information than it reveals, and for this reason it is
usually advisable not to compute averages for distributions of
such extreme types. Perhaps the mode is the best of the
three averages in situations of this kind; but even its value is
questionable.

Table 30. Age Distributions (Hypothetical Data)

  Years of age     (1)     (2)     (3)
                    f       f       f
    0- 4.9          21      18     116
    5- 9.9          53      21      53
   10-14.9         116      47      18
   15-19.9          47      53      47
   20-24.9          18     116     132
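How little a mean may represent a J-shaped or U-shaped distribution can be seen by computing, for each column of Table 30, the share of the items that falls in the class containing the mean. The sketch below is illustrative only: the names are invented here, and the classes 0-4.9, 5-9.9, etc., are treated as running from 0 to 5, 5 to 10, etc., so that the mid-points are taken as 2.5, 7.5, and so on.

```python
# Assumed mid-points for the classes of Table 30 (see the note above).
MIDPOINTS = [2.5, 7.5, 12.5, 17.5, 22.5]
CLASS_WIDTH = 5

COLUMNS = {
    "(1) roughly symmetrical": [21, 53, 116, 47, 18],
    "(2) J-shaped": [18, 21, 47, 53, 116],
    "(3) U-shaped": [116, 53, 18, 47, 132],
}


def grouped_mean(freqs):
    """Arithmetic mean of a frequency distribution, from class mid-points."""
    return sum(f * x for f, x in zip(freqs, MIDPOINTS)) / sum(freqs)


def share_in_mean_class(freqs):
    """Proportion of all items lying in the class that contains the mean."""
    index = min(int(grouped_mean(freqs) // CLASS_WIDTH), len(freqs) - 1)
    return freqs[index] / sum(freqs)


if __name__ == "__main__":
    for label, freqs in COLUMNS.items():
        print(label, round(grouped_mean(freqs), 2),
              round(share_in_mean_class(freqs), 3))
```

For col. (3) the mean falls in the 10-14.9 class, which is the least frequent class of all, so that the "average" age is the age fewest persons have.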

Enlarging on the last point, special precaution is necessary to
avoid the use of averages to represent a group that varies widely
within itself. Thus a single infant mortality rate for a county
containing a large city and a rural area in which the rates are
very different is likely to be not only meaningless, but misleading.
This point must be kept constantly in mind in most statistical
problems, e.g., the calculation of a correlation coefficient. The
latter, which is an average, may indicate a moderate amount of
relationship over the whole table, whereas actually there is no
relationship at one end of the table and a close relationship at
the other (see Chap. X).

Fig. 32. Graph of roughly symmetrical frequency distribution of Table 30,
Col. (1).

Fig. 33. Graph of J-shaped frequency distribution of Table 30, Col. (2).

Fig. 34. Graph of U-shaped frequency distribution of Table 30, Col. (3).

(Horizontal axis in each figure: X = years of age.)

It should be noticed that an average, usually the mean, may
sometimes legitimately be used for the purpose of resolving a
series of values into a single composite value, whether the latter is
"representative" of the values in the series or not. This is the
case when the chief interest lies merely in comparing the
composite values of two or more series, as the mean size of income of
all workers with the mean size of income of unskilled laborers
alone.

In most cases, it is important to exhibit the table of the
frequency distribution as a whole, so that the distribution of the
items, as well as their averages, may be known to the reader.
It is also a practice in doubtful cases to present all three averages
side by side, so that their differences may be seen. This,
however, may merely throw upon the reader the responsibility of
choosing an average.

6. The Geometric Mean. In averaging a series of numbers
that bear an approximately constant ratio to one another, like
2, 4, 8, 16, none of the three averages described above is as
appropriate as the geometric mean. The geometric mean is
used to average any series in which changes are expressed as
rates rather than as absolute differences. It is also preferable
for averaging some skewed distributions, since it gives less
weight to extreme variations than does the arithmetic mean.

The geometric mean is never larger than the corresponding
arithmetic mean, and is smaller unless all the values are equal.
When a series contains a zero or negative
value, its geometric mean cannot be found. Just as the sum
of the plus deviations is equal to the sum of the minus
deviations from the arithmetic mean, so the product of the ratios
of the values smaller than the geometric mean to the geometric
mean is equal to the product of the ratios of the geometric mean
to the values larger than the geometric mean (e.g., the geometric
mean of 5, 8, 10, and 12 is 8.3, and
$\frac{5}{8.3} \times \frac{8}{8.3} = \frac{8.3}{10} \times \frac{8.3}{12}$). Also,
corresponding to the fact that when each member of a series is
replaced by the arithmetic mean of the series the sum of the
series is not changed (e.g., 3 + 7 + 5 = 15, and 5 + 5 + 5 = 15),
so, when each member of a series is replaced by the geometric
mean, the product remains the same (e.g., 12 × 34 × 4 = 1,632,
and 11.7735 × 11.7735 × 11.7735 = 1,632).

For an ungrouped series of values $X_1, X_2, \ldots, X_n$, the formula
for the geometric mean is

\[ G = \sqrt[n]{X_1 \cdot X_2 \cdots X_n}. \tag{14} \]

For grouped data,

\[ G = \sqrt[N]{X_1^{f_1} \cdot X_2^{f_2} \cdots X_n^{f_n}}, \tag{15} \]

where $X_i$ is a mid-point and $f_i$, its exponent,¹ is the corresponding
class frequency. Computation, however, is most conveniently
done by means of logarithms, using the respective formulas:

\[ \log G = \frac{1}{n} \sum_{1}^{n} \log X, \tag{16} \]

\[ \log G = \frac{1}{N} \sum_{1}^{n} f \log X, \tag{17} \]

where $N = \sum_{1}^{n} f$.

¹ Exponent means the power to which X is raised, e.g., $(X)^2$. Here the
exponent is 2, the second power.

To illustrate the use of formula (16), the geometric mean of
the rates in col. (4) of Table 32 below is found from a table of
logarithms.*

\[
\begin{aligned}
\log G &= \tfrac{1}{7}(\log 0.015 + \log 0.058 + \log 0.061 + \log 0.047 \\
       &\qquad + \log 0.029 + \log 0.011 + \log 0.001) \\
       &= \tfrac{1}{7}(8.17609 - 10 + 8.76343 - 10 + 8.78533 - 10 \\
       &\qquad + 8.67210 - 10 + 8.46240 - 10 + 8.04139 - 10 + 7.00000 - 10) \\
       &= \tfrac{1}{7}(57.90074 - 70) \\
       &= 8.27153 - 10, \\
     G &= 0.019.
\end{aligned}
\]

Notice that the geometric mean obtained by formula (16) is
unweighted, i.e., each rate is given equal weight. The unweighted
arithmetic mean of the same rates is 0.03171, while the weighted
arithmetic mean rate, from cols. (2) and (3) of Table 32, is
26,326/790,193 = 0.03331.
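With a computing machine in place of a table of logarithms, formula (16) can be sketched as follows. The function name `geometric_mean_by_logs` is an assumption made for this illustration; the rates are those of col. (4) of Table 32 as used in the computation above.

```python
import math

# Rates from col. (4) of Table 32, as used in the worked example.
RATES = [0.015, 0.058, 0.061, 0.047, 0.029, 0.011, 0.001]


def geometric_mean_by_logs(values):
    """Formula (16): log G = (1/n) * sum(log X), using common logarithms."""
    if any(v <= 0 for v in values):
        # A zero or negative value has no logarithm, so no geometric mean.
        raise ValueError("geometric mean requires positive values")
    log_g = sum(math.log10(v) for v in values) / len(values)
    return 10 ** log_g


unweighted_geometric = geometric_mean_by_logs(RATES)
unweighted_arithmetic = sum(RATES) / len(RATES)

if __name__ == "__main__":
    print(round(unweighted_geometric, 3), round(unweighted_arithmetic, 5))
```

The geometric mean comes out well below the arithmetic mean because the single very small rate, 0.001, pulls it down in proportion rather than in absolute amount.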

The total column of the table in Exercise 1 below shows a
skewed distribution, so that the geometric mean should be more
representative of it than the arithmetic mean. By formula (17),

\[
\begin{aligned}
\log G &= \tfrac{1}{391}(73 \log 2 + 96 \log 6 + 101 \log 10 + 48 \log 14 \\
       &\qquad + 52 \log 18 + 21 \log 22) \\
       &= \tfrac{1}{391}[73(0.30103) + 96(0.77815) + 101(1) + 48(1.14613) \\
       &\qquad + 52(1.25527) + 21(1.34242)] \\
       &= \tfrac{1}{391}(21.97519 + 74.70240 + 101.00000 + 55.01424 \\
       &\qquad + 65.27404 + 28.19082) = 0.88531, \\
     G &= 7.68.
\end{aligned}
\]

The arithmetic mean is 9.72 and the median is 9.05. Formula
(17) gives a weighted geometric mean, or geometric mean of a
frequency distribution.

* See Appendix, Table 7, and accompanying explanation.
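Formula (17) for a frequency distribution can be sketched in the same way. The function name `weighted_geometric_mean` is assumed here; the mid-points and frequencies are those of the total column of the table in Exercise 1, as used in the computation above.

```python
import math

# Total column of the table in Exercise 1: mid-points (X) and frequencies (f).
MIDPOINTS = [2, 6, 10, 14, 18, 22]
FREQS = [73, 96, 101, 48, 52, 21]


def weighted_geometric_mean(midpoints, freqs):
    """Formula (17): log G = (1/N) * sum(f * log X), where N = sum(f)."""
    n = sum(freqs)
    log_g = sum(f * math.log10(x) for x, f in zip(midpoints, freqs)) / n
    return 10 ** log_g


geometric = weighted_geometric_mean(MIDPOINTS, FREQS)
arithmetic = sum(f * x for x, f in zip(MIDPOINTS, FREQS)) / sum(FREQS)

if __name__ == "__main__":
    print(round(geometric, 2), round(arithmetic, 2))
```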
Notice the application of the geometric mean to the problem
of estimating the population midway between two decennial
censuses. In Table 31 the population of the United States in
millions is shown at 10-year intervals from 1790 to 1940. When
these figures are plotted, we get the absolute growth curve
shown in Fig. 35.

Table 31. Population of the United States, 1790-1940
(In millions)

  Year    Population        Year    Population
  1790       3.93           1870      38.56
  1800       5.31           1880      50.16
  1810       7.24           1890      62.95
  1820       9.64           1900      75.99
  1830      12.87           1910      91.97
  1840      17.09           1920     105.71
  1850      23.19           1930     122.78
  1860      31.44           1940     131.41 (prelim.)

Fig. 35. Absolute growth of population, United States, 1790-1940.

Now suppose it is wanted to estimate the population in 1795,
midway between the censuses of 1790 and 1800. If we take the
arithmetic mean of the populations at 1790 and at 1800, we have

\[ \frac{5.31 + 3.93}{2} = 4.62 \text{ millions.} \]

This evidently assumes that the absolute amount of population
increase is the same over equal periods of time, since

\[ 4.62 - 3.93 = 0.69, \]

and 5.31 - 4.62 = 0.69. From Table 31, however, we see
that the differences, 5.31 - 3.93 = 1.38 and 7.24 - 5.31 = 1.93,
are not equal, and this is borne out by inspection of Fig. 35.
Actually, around the dates 1790 and 1800, as far as we can judge
from the given data, the absolute growth in population was
increasing. Under these conditions, the growth curve between
1790 and 1800 would probably be concave, as shown by line a
in Fig. 36. The population in 1795 would then be somewhat
less than that found by the method of the arithmetic mean,
which implies a straight line rather than a concave trend (line b
in Fig. 36). On the simple assumption that the rate of annual
increase was constant between 1790 and 1800, the growth curve
will be concave, and the geometric mean will give the exact
population in 1795. The geometric mean is, therefore, usually
regarded as the logical average to use when the growth curve is
concave. The formula may be written

\[ P = \sqrt{P_0 P_{10}} = (P_0 P_{10})^{1/2}, \tag{18} \]

where $P$ is the population midway between the two censuses, $P_0$
is the population at the first census, and $P_{10}$ is the population at
the second census. Substituting in this formula,

\[ P = \sqrt{3.93(5.31)} = 4.57 \text{ millions.} \]

Fig. 36. Probable trend of population growth in the United States, 1790-1800.

The geometric mean is not a suitable average, however, when
the absolute amount of change is less each decade, as happened
between 1930 and 1940. The growth curve is then convex (like
c, Fig. 36), so that both the arithmetic and the geometric means
give too low estimates of the population midway between
censuses.
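The two estimates of the 1795 population can be compared in a short Python sketch. The variable names are assumptions made here; the census figures are those of Table 31. The last lines check the statement that the geometric mean is exact when the annual rate of increase is constant: growing the 1790 population for 5 years at the constant rate that carries it to the 1800 figure in 10 years gives the geometric mean.

```python
import math

P_1790, P_1800 = 3.93, 5.31  # millions, from Table 31

# Straight-line estimate: equal absolute increase in each half of the decade.
arithmetic_estimate = (P_1790 + P_1800) / 2

# Formula (18): equal rate of increase in each half of the decade.
geometric_estimate = math.sqrt(P_1790 * P_1800)

# Constant annual rate that carries P_1790 to P_1800 in 10 years.
annual_rate = (P_1800 / P_1790) ** (1 / 10) - 1
five_year_projection = P_1790 * (1 + annual_rate) ** 5

if __name__ == "__main__":
    print(round(arithmetic_estimate, 2), round(geometric_estimate, 2))
    print(round(five_year_projection, 2))
```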
If we wish to calculate the constant annual rate of population
increase that was assumed in finding the geometric mean (= 4.57)
above, we apply the formula

\[ r = \left(\frac{P_n}{P_0}\right)^{1/n} - 1. \tag{19} \]

If, as before, $P_0 = 3.93$, $P_n = 5.31$, and $n = 10$, we have

\[ r = \left(\frac{5.31}{3.93}\right)^{1/10} - 1 = (1.35)^{1/10} - 1. \]

By logarithms,

\[ \log (1.35)^{1/10} = \tfrac{1}{10} \log 1.35 = \tfrac{1}{10}(0.13033) = 0.013033. \]

So

\[ (1.35)^{1/10} = 1.03, \]

and $r = 1.03 - 1.00 = 0.03$.

That is, in finding the geometric mean we assumed that the
population increased at the average rate of about 3 per cent per
year between 1790 and 1800.

For the same problem, the arithmetic mean gives a rate of

\[ \frac{5.31 - 3.93}{3.93(10)} = 3.5 \text{ per cent,} \]

which if assumed to be constant over the 10-year period would
result in a population in 1800 of¹

\[ P_{10} = P_0(1 + r)^{10}, \tag{20} \]
\[ P_{10} = 3.93(1.035)^{10}. \]
\[ \log (1.035)^{10} = 10 \log 1.035 = 10(0.01494) = 0.14940. \]

So

\[ (1.035)^{10} = 1.411, \]

and

\[ P_{10} = 3.93(1.411), \]

or

\[ P_{10} = 5.55 \text{ millions,} \]

whereas, actually, $P_{10} = 5.31$ millions.
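Formulas (19) and (20) can be sketched in Python as below. The function names `annual_rate` and `project` are invented for this illustration. Because the machine carries more decimal places than the four-place logarithms used above, the projected figure differs slightly in the last place from the 5.55 of the text.

```python
def annual_rate(p0, pn, years):
    """Formula (19): r = (Pn / P0) ** (1/n) - 1."""
    return (pn / p0) ** (1 / years) - 1


def project(p0, rate, years):
    """Formula (20), generalized to n years: Pn = P0 * (1 + r) ** n."""
    return p0 * (1 + rate) ** years


P_1790, P_1800 = 3.93, 5.31  # millions, from Table 31

# Rate implied by the geometric mean (constant rate of increase).
geometric_rate = annual_rate(P_1790, P_1800, 10)

# Rate implied by the arithmetic mean (constant absolute increase),
# expressed as a fraction of the 1790 population per year.
arithmetic_rate = (P_1800 - P_1790) / (P_1790 * 10)

if __name__ == "__main__":
    print(round(geometric_rate, 3), round(arithmetic_rate, 3))
    # Compounding the geometric rate recovers the 1800 census figure.
    print(round(project(P_1790, geometric_rate, 10), 2))
    # Compounding 3.5 per cent overshoots it.
    print(round(project(P_1790, 0.035, 10), 2))
```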


¹ Formulas (19) and (20) may be derived as follows:

Let

  $P_0$ = population of the state on Jan. 1, 1930,
  $P_1$ = population of the state on Jan. 1, 1931, etc.,
  $r$ = constant annual rate of increase.

Then

\[ P_1 = P_0 + P_0 r = P_0(1 + r), \]

\[
\begin{aligned}
P_2 &= P_1 + P_1 r \\
    &= P_0(1 + r) + P_0(1 + r)r \\
    &= P_0(1 + r)(1 + r) \\
    &= P_0(1 + r)^2,
\end{aligned}
\]

\[
\begin{aligned}
P_3 &= P_2 + P_2 r \\
    &= P_0(1 + r)^2 + P_0(1 + r)^2 r \\
    &= P_0(1 + r)^2(1 + r) \\
    &= P_0(1 + r)^3.
\end{aligned}
\]

Similarly,

\[ P_{10} = P_0(1 + r)^{10}. \]

If n = number of years between censuses,

\[ P_n = P_0(1 + r)^n, \]

or

\[ (1 + r)^n = \frac{P_n}{P_0}, \]
\[ \log (1 + r)^n = \log \frac{P_n}{P_0}, \]
\[ n \log (1 + r) = \log \frac{P_n}{P_0}, \]
\[ \log (1 + r) = \frac{1}{n} \log \frac{P_n}{P_0}, \]
\[ \log (1 + r) = \log \left(\frac{P_n}{P_0}\right)^{1/n}, \]

or

\[ 1 + r = \left(\frac{P_n}{P_0}\right)^{1/n}, \]

and

\[ r = \left(\frac{P_n}{P_0}\right)^{1/n} - 1. \]

7. Population Rates. The ratio of divorces to population, say
3.2 per 1,000, is an illustration of the kind of rate that is important
for sociologists. Other examples are the crude birth rate (births
per 1,000 population per year) and the crime rate (say,
convictions per 1,000 males 10 years old and over per year). A rate
shows the amount of one variable per given amount of another
variable, e.g., the number of births in relation to a given number
of women of child-bearing age in a population.

In working with population rates, such as marriage rates,
death rates, etc., it is helpful to have in mind what is meant
by a rate. Mathematicians define a rate as the amount of
change in a function (dependent variable) that occurs per unit
change in the independent variable. The rate of travel of an
automobile is the number of miles by which its position in space
(function) changes per change of 1 hour in time (independent
variable). How does a marriage rate fit the mathematical
idea of a rate? The usual form of the marriage rate is the
number of marriages per 1,000 population per year. Here,
however, there are three variables instead of the two mentioned
in the mathematical definition of a rate. Which of these is the
function, which is the independent variable, and how is the third
variable to be interpreted? The time element is usually regarded
as the independent variable in rate problems, and the factor
that varies with time, as the function. In our example, both
the number of marriages and the size of the population base may
change from year to year. Either of these alone related to time
would give a mathematical rate. But we are not interested in
such a rate. Rather, we want to know how the ratio of marriages
to total population changes with time. It is, then, this ratio
that is the function in our marriage rate.

In the case of the marriage rate, we are primarily interested
in the annual changes in the number of marriages, and not in the
change in the population base. The only reason for introducing
the population base at all is to eliminate it as a cause of change
in the number of marriages, so that the annual change in the
number of marriages may be comparable from one population to
another.

This raises an important point. Is the population base the
only factor that needs to be eliminated or controlled in order
that the marriage rate may mean just what we want it to? In
order to have the marriage rate as comparable as possible from
one population to another, should we not also control the factors
of age and sex composition, so that their influences are removed
from the rate? That depends on the question we want to
answer. If our question is, which of two or more total
populations has the higher marriage ratio, regardless of the causes
involved, we do not control age and sex in our ratio. But if
we wish to know which of the populations would have the higher
marriage rate if their age and sex distributions were the same,
we must control age and sex. This leads us to the so-called
age-specific, gross, and net marriage rates for females. In all
such rates, we note as a general principle that the denominator
or base of the final rate should ideally contain only the group
exposed to the event (e.g., if the event is marriage, the group
exposed should be composed exclusively of, say, unmarried
females), while the numerator should contain the number of
events (e.g., marriages) occurring in the year. In the case of
most crude rates, like official birth and marriage rates, this
principle is disregarded.

When marriage or other rates are plotted, it is usually advisa- 
ble to plot them on semilogarithmic paper, in order that the rate of 
change may be shown by the steepness of the graph. Plotting the 
rates directly on semilogarithmic paper is equivalent to plotting 
the logarithms of the rates, which in turn is similar to plotting 
the percentage of change in the rates from year to year (see Chap. 
VI, Fig. 19). 

It may be of interest to compute two of the most important of
the refined rates used in current vital statistics. The gross
reproduction rate is defined as the average number of girls born
per woman passing through child-bearing age, say 15 to 50 years,
without mortality, and exposed to the birth rate of a given year.
The net reproduction rate is simply the gross reproduction rate
corrected for mortality. In Table 32 these rates have been found
for Wisconsin, with the year 1934 as the base. The gross rate
appears as the total of col. (5), and the net rate as the total of
col. (7). It is seen that each 1,000 women born, if none died and
all were subjected to the average age-specific rates of 1934, would
bear 1,110 daughters. However, if these 1,000 women were
exposed to the death rates found in an appropriate life table,
they would bear only 995 daughters to start the next generation.
Since the actual distribution of women by age groups was
eliminated as a factor in Table 32 from col. (4) on, it is possible
that there may be a disproportionate number of young females
in the population of Wisconsin in 1934 and that this may prevent
the population from actually declining for a time, even though
the net reproduction rate is less than 1. But if the net
reproduction rate of 1934 should continue until the age distribution was
stabilized, the female population of the state would then begin to
decrease at the rate of 5 per 1,000 per generation. As a matter
of fact, the birth rate was unusually low in 1934 on account of
the economic depression and has since risen somewhat. The
average birth rate over a period of, say, 3 to 5 years furnishes a
more stable base than the rate for a single year, and for some
purposes should be preferred.

Table 32. Gross and Net Reproduction Rates in Wisconsin, 1934

Columns:
  (1)  Age groups
  (2)* Females, 15-49, July 1, 1934
  (3)† Female live births, 1934
  (4)‡ Daughters born per female, 15-49, 1934
  (5)§ Average daughters born to female, 15-49, in 5-year period
  (6)‖ Female survival rates from birth
  (7)¶ Average daughters born to a female surviving to age 50

   (1)        (2)        (3)       (4)      (5)       (6)       (7)
  15-19     139,600     2,147     0.015    0.075    0.92512    0.070
  20-24     131,369     7,599     0.058    0.290    0.91480    0.265
  25-29     118,042     7,256     0.061    0.305    0.90117    0.275
  30-34     108,496               0.047    0.235    0.88626    0.208
  35-39     103,165     2,987     0.029    0.145    0.87016    0.126
  40-44     101,089     1,147     0.011    0.055    0.85071    0.047
  45-49      88,432               0.001    0.005    0.82522    0.004
  Total     790,193    26,326              1.110               0.995

* Estimated from the 1930 census with the aid of a life table for Wisconsin.
† Found by applying the percentage of total births, female, in 1934 to total live births,
corrected for underregistration.
‡ Column (3) divided by col. (2).
§ Column (4) multiplied by 5, since a woman in any 5-year age group is assumed to bear
as many daughters in each of the 5 years as in 1934.
‖ Taken from Life Table for White Females in Wisconsin, 1929-1931, prepared by the
Metropolitan Life Insurance Company.
¶ Column (5) multiplied by col. (6).
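The steps described in the footnotes to Table 32 can be sketched in Python. The names used are assumptions made for this illustration. The age-specific rates are those of col. (4), taking the rate for ages 45-49 as 0.001 (the value used in the geometric-mean example of Section 6), and the survival rates are those of col. (6).

```python
# Table 32, col. (4): daughters born per female per year, by 5-year age group.
DAUGHTERS_PER_FEMALE = [0.015, 0.058, 0.061, 0.047, 0.029, 0.011, 0.001]
# Table 32, col. (6): proportion of females surviving from birth to each group.
SURVIVAL_FROM_BIRTH = [0.92512, 0.91480, 0.90117, 0.88626,
                       0.87016, 0.85071, 0.82522]
YEARS_PER_GROUP = 5


def gross_reproduction_rate(rates, years=YEARS_PER_GROUP):
    """Total of col. (5): each annual rate counted for the years in its group."""
    return sum(rate * years for rate in rates)


def net_reproduction_rate(rates, survival, years=YEARS_PER_GROUP):
    """Total of col. (7): col. (5) multiplied by col. (6), then summed."""
    return sum(rate * years * s for rate, s in zip(rates, survival))


gross = gross_reproduction_rate(DAUGHTERS_PER_FEMALE)
net = net_reproduction_rate(DAUGHTERS_PER_FEMALE, SURVIVAL_FROM_BIRTH)

if __name__ == "__main__":
    print(round(gross, 3), round(net, 3))
```

A net rate below 1 means that, once the age distribution has become stable, each generation of women fails to replace itself.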


Exercises

Note: A calculating machine will save time in solving the problems
in this text. At least the student should own an inexpensive slide rule.

1. a. Find the crude mode, where appropriate, of each of the
following four series, and of all four combined, using formula (1):

Age of Children in Four Three-generation Kinship Groups

  Age of child,      Number of children in kinship group        Total
  years                 I        II       III       IV         children
  (X)*                (f1)†     (f2)     (f3)      (f4)          (f)
     2                  10       15       30        18            73
     6                  21       20       29        26            96
    10                  33       36       18        14           101
    14                  20       10        6        12            48
    18                  12        7        5        28            52
    22                   4        1        3        13            21
  Total                100       89       91       111           391

* Mid-point.
† Frequency.

b. By use of formula (2), find the crude mode of the total series of
ages.

2. a. What is the median of each of the six series below? 


Number of Persons per Broken Home 




Set III 

Set IV 

Set V 

Set VI 

3 

3 

3 

3 

3 

3 

5 

5 

5 

5 

2 

1 

4 

4 

4 

4 

4 

2 

1 

1 

2 

2 

1 

4 

6 

6 

6 

1 

5 

5 

8 

8 

8 

4 

8 

8 

2 

2 

2 

6 

11 

4 

11 i 

111 

11 

4 

5 

11 





12 



b. What is the median of series IV and VI combined (added by rows)?
Note: These series contain too few cases for the medians to have much
meaning; they are useful only for practice in finding the median.

3. a. Calculate the median of the two frequency distributions below: 


Percentage of Churches without a Full-time Minister in the Rural
Counties of Two Regions

  Percentage          Region I,        Region II,       Regions I and II,
  of churches (X*)    counties (f1)    counties (f2)    counties (f)
      2.5                  22               4                 26
      7.5                  94              18                112
     12.5                 221              26                247
     17.5                  85              17                102
     22.5                  67              25                 92
     27.5                  39              14                 53
    Total                 528             104                632

* Mid-point.

b. What is the median of the two distributions combined? How
does it compare with the mean of the medians of the two separate
distributions? What is the meaning of the mean of the medians?

4. The rural counties in 15 states were scored on various points, such
as percentage of homes with telephone, per capita expenditure for
schools, and so on, and the median score for the counties in each state
was determined, giving 15 medians. It was then wanted to know the
median score of all counties in the 15 states together. How would you
find this?

5. In the table below, what is the arithmetic mean of (a) the
populations of the counties? (b) the birth rates?

  County     Population     Birth rate per
                (X1)        1,000 population
                                 (X2)
     1          8,003            19.5
     2         21,054            24.5
     3         34,301            21.1
     4         15,006             9.8
     5         72,573            23.1
     6         15,330            16.4
     7         10,233            17.4
     8         16,848            12.6
     9         37,581            21.2
    10         34,165            16.7
    11         30,503            19.1
    12         16,781            21.6
    13        119,217            18.3
    14         52,745            14.0
    15         18,182            18.6
    16         46,583            16.9
    17         27,037            17.3
    18         42,565            22.1
    19          3,815            15.5
    20         59,928            16.9
    21         11,471            25.2
    22         38,469            19.9
    23         21,953            18.1
    24         13,913            13.5
    25         20,039            19.2


6. Find the mean of the following table by the short method, and
check it by changing the assumed mean.

Weekly Wages Received by 500 Women Employed in a Garment Factory

  Weekly wages       Women
      (X)             (f)
  $2.50- 3.49           5
   3.50- 4.49          71
   4.50- 5.49         126
   5.50- 6.49         132
   6.50- 7.49          98
   7.50- 8.49          47
   8.50- 9.49          23
   9.50-10.49           9
  Total               511

7. What is the mean of the table below?

Annual Net Incomes of 150 Louisiana Cotton Farms, 1936

  Income       Farms
   (X)*         (f)†
  $  500         62
     750         45
    1000         23
    1250          8
    1500          6
    1750          2
    2000          2
    2250          1
    2500          1
  Total         150

* Mid-point.
† Frequency.


8. Calculate the mean number of years on farm reported by Iowa
farmers in 1929. Use deviations from an assumed mean.

Iowa Farm Operators Classified According to Number of Years on
Farm, 1930

  Years on farm            Farmers
       (X)                   (f)
  Under 1 year              25,625
  1 year                    20,140
  2 to 4 years              36,496
  5 to 9 years              33,465
  10 years and over*        92,142
  Total                    207,868

(Abstract of the Fifteenth Census of the United States, 1930, p. 582.)

* Take the mid-point of this interval at 15 years.

9 . The arithmetic mean of the number of years on farm reported by 
249,588 Alabama farmers in 1930 was 6.1. What is the mean number 
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of years reported by Iowa and Alabama farmers combmed, using the 
data of Exercise 

10. The counties of Oklahoma are to be grouped according to their 
infant mortahty rates in 1939 as pubhshed by the Oklahoma Bureau of 
Vital Statistics, with the purpose of correlating these rates with, the 
per capita expenditures for pubhc schools Have you any criticisms 
of this method? 

11. A writer on the family recently made this statement: The Census 
Report for 1930 showed the average size of the American family to be 
3 81 persons. But averages tell us httle.’’ Can you suggest any 
important information that this average conceals*^ 

12. Can you propose a refinement of the crude marriage rate analo- 
gous to the gross reproduction rate described in the text? How would 
it differ in meamng from the present crude rate? 

13. Calculate the net reproduction rate for your state, and explain 
its meaning Is the population of the state increasing or decreasing 
at present? If the answer to this question seems to contradict the net 
reproduction rate found, can you reconcile the difference? 

14. What do you consider to be the most meaningful base for a divorce 
rate and why? 

16. At what mean rate did the population of Nashville, Tenn , increase 
between 1880 and 1890? Between 1920 and 1930? Plot the observed 
populations first on rectangular coordinate paper, then on semiloga- 
rithmic paper, and study the differences 

Population of Nashville, Tenn., 1870-1930

Census        Population
 1870           25,865
 1880           43,350
 1890           76,168
 1900           80,865
 1910          110,364
 1920          118,342
 1930          153,866


16. Using the data of Exercise 15, compare the geometric and arithmetic
mean populations of Nashville, Tenn., between 1870 and 1930, and
plot them in the graphs prepared in Exercise 15. Explain the results.

CHAPTER VIII

MEASURES OF DEVIATION AND PARTITION 

1. Deviation from an Average. — It is seldom possible to give
a good idea of a series of ungrouped values or of a frequency
distribution by means of a single value, or average, alone. It is
generally wise to exhibit the whole distribution in tabular form,
and often to show it graphically as well. Mention of the range of
the values, i.e., the highest and lowest values in the series and the
difference between them, is desirable. It is also important to
accompany the average with some measure of variation or
dispersion. The purpose of a measure of dispersion is to show the
extent to which the individual items in a series vary from their
average. If the average value of the items is known, and also
the amount by which a certain proportion of the items deviate
from that average, a rather satisfactory idea of the distribution
may be conveyed. For example, note the ungrouped items 4, 1,
6, 7, 3, 9, 2, 1, 3, 4, representing the number of years between
marriage and divorce in the case of 10 divorced couples. Their
mean is 4 years. Six out of the 10 cases do not differ from the
mean by more than 2 years. If, therefore, we describe the
distribution to the reader by saying that the mean time between
marriage and divorce is 4 years, and that three-fifths of the cases
do not deviate from the mean by more than 2 years, he should
have a better notion of the distribution than if we merely told
him to imagine 10 couples whose mean time between marriage
and divorce was 4 years.

2. The Average Deviation. — The simplest of the measures of
dispersion is obtained by finding the amount by which each item
deviates from the average value, adding these without regard to
sign, and dividing the sum by the number of items, to obtain
the average amount of deviation. Such a measure of deviation
or dispersion is appropriately called the average deviation, and
is often represented by the symbol A.D.

In the case of ungrouped data, like the above series, 4, 1, 6,
7, 3, 9, 2, 1, 3, 4, representing the number of years between
marriage and divorce for 10 divorced couples, the average deviation
from the mean value of 4 years is found as shown in Table 33.

Table 33. — Computations for the Mean Deviation, Ungrouped Data

        X − M_x* = x
        4 − 4 =  0
        1 − 4 = −3
        6 − 4 = +2
        7 − 4 = +3
        3 − 4 = −1
        9 − 4 = +5
        2 − 4 = −2
        1 − 4 = −3
        3 − 4 = −1
        4 − 4 =  0

        $\sum_{1}^{10} |x| = 20$†

* M_x indicates the mean of the X values.
† The lines | | indicate that signs are disregarded. $\sum_{1}^{10}$ means to add the 10 items.

If we add the values of x with respect for the signs, the result is
zero. Disregarding signs, however, the total is 20, and

$$\text{A.D.} = \frac{20}{10} = 2.$$

That is, the 10 values differ on the average from their mean by 2
years.
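The arithmetic of Table 33 can be sketched in a few lines of Python. This is only an illustration of the definition given above; the function name `average_deviation` is chosen here for convenience and does not come from the text.

```python
# Years between marriage and divorce for the 10 couples of Table 33.
years = [4, 1, 6, 7, 3, 9, 2, 1, 3, 4]


def average_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


mean_years = sum(years) / len(years)
# With signs respected, the deviations from the mean cancel out.
signed_total = sum(v - mean_years for v in years)
ad = average_deviation(years)

print(mean_years, signed_total, ad)
```

The signed deviations sum to zero, which is why the signs must be disregarded before averaging.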

A formula for use with grouped data is

$$\text{A.D.} = \frac{\sum f|x|}{N},$$

where f is the frequency in any class interval, X is the value or
mid-point corresponding to a given frequency, Av is the average
used (mean, median, or mode; usually the mean), x = X − Av,
and N is the number of items or the sum of the frequencies (f).
The calculation of the A.D. from the mean, M, is illustrated in
Table 34. In the table, the x's are obtained, of course, by
subtracting the value of the mean, 0.67, from each of the X
values.

There are short methods of finding the average deviation from
the mean or median, but they are rather cumbersome and will
not be described here.¹

¹ See, for example, H. Sorenson, Statistics for Students of Psychology and
Education, p. 137, McGraw-Hill Book Company, Inc., New York, 1936.

Table 34. — Number of Previous Arrests Recorded for 100 Murderers

Previous arrests   Prisoners
      (X)             (f)         fX         x        f|x|
       0              60           0       −0.67     40.20
       1              20          20       +0.33      6.60
       2              15          30       +1.33     19.95
       3               3           9       +2.33      6.99
       4               2           8       +3.33      6.66
     Total           100          67                 80.40

$$M = \frac{67}{100} = 0.67.$$

$$\text{A.D.} = \frac{80.4}{100} = 0.804.$$


The average deviation is usually smaller when taken from the
median than when taken from the mean or the mode.
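As a rough check on Table 34, the grouped formula can be written out in Python. The helper names are illustrative only. The comparison with the median assumes that the median of this table is taken as 0 (the value held by the 50th and 51st cases when the counts are treated as discrete); an interpolated median would give a slightly different figure.

```python
# Table 34: number of previous arrests (X) and number of prisoners (f).
arrests = [0, 1, 2, 3, 4]
prisoners = [60, 20, 15, 3, 2]


def grouped_mean(xs, fs):
    """Arithmetic mean of a frequency table."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def grouped_average_deviation(xs, fs, center):
    """Sum of f|X - center| divided by N."""
    return sum(f * abs(x - center) for x, f in zip(xs, fs)) / sum(fs)


m = grouped_mean(arrests, prisoners)
ad_from_mean = grouped_average_deviation(arrests, prisoners, m)
# Assumption: the median is taken as 0, since 60 of the 100 cases are 0.
ad_from_median = grouped_average_deviation(arrests, prisoners, 0)

print(round(m, 2), round(ad_from_mean, 3), round(ad_from_median, 3))
```

Under that assumption the deviation from the median comes out smaller than the deviation from the mean, in line with the remark above.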
3. The Standard Deviation. — Because the average deviation
disregards negative signs, another measure of dispersion, known
as the standard deviation, has been devised, which is free from
this objection. It is found by subtracting each X value, or
in grouped data each mid-point value, from the mean of the X
values, squaring these differences to make all signs positive,
multiplying them by their respective frequencies, summing them,
dividing by the sum of the frequencies, and extracting the
square root. The formula for the standard deviation is, therefore,

$$\sigma = \sqrt{\frac{\sum (X - M_x)^2}{N}}, \qquad (22)^*$$

or

$$\sigma = \sqrt{\frac{\sum f(X - M_x)^2}{N}}. \qquad (23)$$

Letting x = X − M_x,

$$\sigma = \sqrt{\frac{\sum x^2}{N}}, \qquad (24)$$

or

$$\sigma = \sqrt{\frac{\sum f x^2}{N}}, \qquad (25)$$

where σ is the standard deviation of the X values, X is the value
of an item or the value of the mid-point of a group of items, f is
the frequency of the items in a group or class interval (for
ungrouped data, f = 1), and N is the number of items, i.e.,
N = Σf.

* The Greek letter, small sigma, σ, is conventionally used to represent the
standard deviation.

To save labor in computing the standard deviation for a large
frequency table, a short method is commonly used:

$$\sigma = i\sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad (26)^1$$

where d is the deviation of the mid-points from a guessed mean in
class interval units, and i is the width of the class interval. This
formula may also be modified for use with ungrouped data by taking the
assumed² mean at zero, so that d = X, f = 1, and i = 1:

$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}, \qquad (27)$$

or

$$\sigma = \sqrt{\frac{\sum X^2}{N} - M^2}. \qquad (28)$$


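¹ Derivation of formula (26).
By definition,

$$\sigma = \sqrt{\frac{\sum f(X - M)^2}{N}}. \qquad (23)$$

From Chap. VII, formulas (10) and (13),

$$X = A + id, \qquad (a)$$

$$M = A + i\,\frac{\sum f d}{N}. \qquad (b)$$

Substituting from (a) and (b) in (23), and using Σf = N in the last step,

$$\sigma = \sqrt{\frac{\sum f\left(A + id - A - i\,\frac{\sum f d}{N}\right)^2}{N}}$$

$$\sigma = i\sqrt{\frac{\sum f\left(d - \frac{\sum f d}{N}\right)^2}{N}}$$

$$\sigma = i\sqrt{\frac{\sum f d^2}{N} - 2\left(\frac{\sum f d}{N}\right)^2 + \frac{\sum f}{N}\left(\frac{\sum f d}{N}\right)^2} \qquad (c)$$

$$\sigma = i\sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}. \qquad (26)$$

² See Chap. VII.

In Chap. VII, we had the ungrouped data, 3, 7, 2, 12, 1, 16, 4,
representing the numbers of children in seven Italian immigrant
families. The mean number of children per family was found
to be 6.43. What is the standard deviation? If we use the
long method of formula (22) or (24) above, we require the
computations shown in Table 35.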

Table 35. — Computations for the Standard Deviation, Ungrouped Data
(Long method)

   X       X − M_x = x            (X − M_x)² = x²
   3       3 − 6.43 = −3.43       (−3.43)² = 11.76
   7       7 − 6.43 = +0.57       ( 0.57)² =  0.32
   2       2 − 6.43 = −4.43       (−4.43)² = 19.62
  12      12 − 6.43 = +5.57       ( 5.57)² = 31.02
   1       1 − 6.43 = −5.43       (−5.43)² = 29.49
  16      16 − 6.43 = +9.57       ( 9.57)² = 91.58
   4       4 − 6.43 = −2.43       (−2.43)² =  5.90
 Total                                      189.69

Substituting in formula (22),

$$\sigma = \sqrt{\frac{189.69}{7}} = 5.21.$$

For the short method of formula (27) or (28), we need only the
two totals, as shown in Table 36.

Table 36. — Computations for the Standard Deviation, Ungrouped Data
(Short method)

   X        X²
   3         9
   7        49
   2         4
  12       144
   1         1
  16       256
   4        16
  45       479


Substituting in formula (27),

$$\sigma = \sqrt{\frac{479}{7} - \left(\frac{45}{7}\right)^2},$$

$$\sigma = 5.21,$$

as before. The saving of labor in comparison with the first
method is evident.
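The agreement of the two methods can be verified with a short Python sketch. The function names `sd_long` and `sd_short` are labels chosen here; they follow formulas (22) and (27) respectively and divide by N, as the text does.

```python
import math

# Numbers of children in the seven families of Tables 35 and 36.
children = [3, 7, 2, 12, 1, 16, 4]


def sd_long(values):
    """Formula (22): root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


def sd_short(values):
    """Formula (27): needs only the totals of X and of X squared."""
    n = len(values)
    return math.sqrt(sum(v * v for v in values) / n - (sum(values) / n) ** 2)


print(round(sd_long(children), 2), round(sd_short(children), 2))
```

Both functions return the same value apart from floating-point rounding; the short method simply avoids forming each deviation separately.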

Let us next find the standard deviation of Table 34 above,
and compare it with the average deviation previously obtained
for the same table. We shall again first employ the long method,
to clarify the meaning of the arithmetic, and to enable the
student to compare the amount of work required relative to the
short method to follow. The formula that describes the long
method for grouped data is formula (23) or (25), which calls for
the computations shown in Table 37. The mean of the table is
0.67.


Table 37. — Computation of Standard Deviation for Table 34
(Long method)

   X       f      X − M_x = x     f(X − M_x) = fx     f(X − M_x)² = fx²
   0      60        −0.67             −40.20               26.93
   1      20        +0.33             + 6.60                2.18
   2      15        +1.33             +19.95               26.53
   3       3        +2.33             + 6.99               16.29
   4       2        +3.33             + 6.66               22.18
 Total   100                            0.00               94.11

Substituting in formula (23) or (25),

$$\sigma = \sqrt{\frac{94.11}{100}} = 0.97.$$

Turning now to the short method of formula (26), the steps are
worked out in Table 38. Notice the so-called Charlier check
included in Table 38: Σf + 2Σfd + Σfd² = Σf(d + 1)², or
100 + 2(−33) + 105 = 139, which is the total of the last
column of the table. This checks all of the work of the table.

Table 38. — Computation of Standard Deviation for Table 34
(Short method)

   X       f       d       fd      fd²     f(d + 1)²
   0      60      −1      −60       60         0
   1      20       0        0        0        20
   2      15      +1      +15       15        60
   3       3      +2      + 6       12        27
   4       2      +3      + 6       18        32
 Total   100              −33      105       139

Substitution in formula (26) now gives

$$\sigma = 1\sqrt{\frac{105}{100} - \left(\frac{-33}{100}\right)^2},$$

$$\sigma = 0.97,$$

which is the value reached by the long method.¹

The average deviation of Table 34 was found in Sec. 2 above to
be 0.804, while we see that the standard deviation is 0.97. The
standard deviation is ordinarily larger than the average deviation
taken from the mean, and is never smaller, because squaring the
differences gives greater weight to the extreme values.
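The short method of formula (26), together with the Charlier check, might be sketched as follows. The function name and its arguments are illustrative assumptions. In a program the Charlier identity holds automatically, so the check is kept here only to mirror the hand computation, where it guards against slips of arithmetic.

```python
import math

# Table 34 again: previous arrests (X) and prisoners (f).
xs = [0, 1, 2, 3, 4]
fs = [60, 20, 15, 3, 2]


def sd_short_grouped(xs, fs, guessed_mean, width):
    """Formula (26), with d measured from a guessed mean in interval units."""
    n = sum(fs)
    ds = [(x - guessed_mean) / width for x in xs]
    sum_fd = sum(f * d for f, d in zip(fs, ds))
    sum_fd2 = sum(f * d * d for f, d in zip(fs, ds))
    charlier = sum(f * (d + 1) ** 2 for f, d in zip(fs, ds))
    # Charlier check: sum f + 2 sum fd + sum fd^2 must equal sum f(d + 1)^2.
    if not math.isclose(n + 2 * sum_fd + sum_fd2, charlier):
        raise ValueError("Charlier check failed")
    sigma = width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return sigma, sum_fd, sum_fd2, charlier


sigma, sum_fd, sum_fd2, charlier = sd_short_grouped(xs, fs, guessed_mean=1, width=1)
print(round(sigma, 2), sum_fd, sum_fd2, charlier)
```

With the guessed mean at X = 1 the column totals agree with those of Table 38, and the result does not depend on which guessed mean is chosen.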

Because of the inaccuracies due to grouping data in class
intervals, the standard deviation squared, called the variance,
of a distribution that is fairly symmetrical² in form is often
corrected by subtracting from it the value $i^2/12$ in the case of a
continuous variable, or $(i^2 - 1)/12$ in the case of a discrete
variable. In the above problem the variable is discrete, so
that we have $(0.97)^2 - (1^2 - 1)/12 = (0.97)^2$, and σ remains
unchanged. There is no error of grouping when the variable is
discrete and i = 1. This correction is known as Sheppard's
correction. In its usual form it cannot be applied to very skewed
or asymmetrical distributions.
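A minimal sketch of the correction, assuming the two forms stated above (i²/12 for a continuous variable and (i² − 1)/12 for a discrete one). The function name and the continuous example, with a variance of 100 and an interval width of 6, are hypothetical and serve only to show the size of the adjustment.

```python
def sheppard_corrected_variance(variance, width, discrete=False):
    """Subtract the grouping correction from an uncorrected variance."""
    if discrete:
        correction = (width ** 2 - 1) / 12
    else:
        correction = width ** 2 / 12
    return variance - correction


# Table 34: discrete variable with interval width 1, so nothing is subtracted.
print(sheppard_corrected_variance(0.97 ** 2, 1, discrete=True))

# Hypothetical continuous case: variance 100, interval width 6.
print(sheppard_corrected_variance(100, 6))
```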

If we have calculated the standard deviation of each of two
series, and then wish to know the standard deviation of the two
series combined, the latter may be found from the formula

$$\sigma = \sqrt{\frac{N_1(\sigma_1^2 + M_1^2) + N_2(\sigma_2^2 + M_2^2)}{N} - M^2}, \qquad (29)$$

where the subscripts differentiate the two series, and no subscript
indicates the combined series. Where there are more than
two series, a term $N_i(\sigma_i^2 + M_i^2)$ is inserted in the formula for
each additional series.

¹ In Table 38, it happens that the X values are already in unit step
deviation form (0, 1, 2, etc.), so that very little labor is saved by using the d
column. We might, therefore, have used X in place of d in formula (26).
The student is asked to do this as a check on the calculations in Table 38.

² The distribution should be normal in form. See Chap. IX.

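For example, for Table 34 we have N₁ = 100, σ₁² = 0.94,
and M₁² = 0.45. In a second sample of the same kind, we are given
N₂ = 80, σ₂² = .6302, and M₂² = 4.56. The mean of the two samples
combined is M = [100(0.67) + 80(2.135)]/180 = 1.32, so that
M² = 1.74. From formula (29), for the two samples combined, we find

$$\sigma = \sqrt{\frac{100(.94 + .45) + 80(.6302 + 4.56)}{180} - 1.74},$$

$$\sigma = 1.16.$$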
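Formula (29) can be sketched for any number of series as below. The function name and the (N, M, σ²) layout of its input are choices made here for illustration, and the mean of the second sample is assumed to be the positive square root of 4.56.

```python
import math


def combined_sd(groups):
    """Formula (29). Each group is a tuple (N, mean, variance)."""
    total_n = sum(n for n, _, _ in groups)
    combined_mean = sum(n * m for n, m, _ in groups) / total_n
    mean_square = sum(n * (var + m * m) for n, m, var in groups) / total_n
    return math.sqrt(mean_square - combined_mean ** 2), combined_mean


# First sample: Table 34. Second sample: mean assumed to be sqrt(4.56).
sigma, mean = combined_sd([(100, 0.67, 0.94), (80, math.sqrt(4.56), 0.6302)])
print(round(sigma, 2), round(mean, 2))
```

Carried through without rounding the intermediate values, the sketch gives about 1.15; the figure of 1.16 in the worked example results from rounding M₁² to 0.45 and M² to 1.74 before substituting.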


Just as the average deviation is usually a minimum when 
taken from the median, so the standard deviation is a minimum 
when taken from the mean. In fact, the standard deviation is 
practically never taken from any average except the mean, and 
formulas (27) and (28), above, are valid only for the mean. 

4. Effect of Coding¹ on Averages and Measures of Dispersion.
If the frequencies in a frequency table are divided through by a
constant, k, the averages and measures of dispersion or partition
calculated from the table will not be changed. Since it is
possible to simplify the computation in this way, it is desirable
to use this device whenever the opportunity offers.

The student is asked to test this for himself, using Table 39,
in calculating the mean and the standard deviation.

Table 39. — Mean Annual Income of 500 Clerical Workers

Mean income (X)      Families (f)
   $  500                 25
    1,000                150
    1,500                200
    2,000                 75
    2,500                 50
    Total                500


It is also often convenient to reduce the absolute frequencies to 
percentage frequencies before using them in computation. 
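The test proposed for Table 39 can be sketched in Python. The constant k = 25 is simply a convenient common divisor of the frequencies chosen for this illustration, and the function name is likewise an assumption.

```python
import math

# Table 39: mean income (X) and number of families (f).
incomes = [500, 1000, 1500, 2000, 2500]
families = [25, 150, 200, 75, 50]


def mean_and_sd(xs, fs):
    """Mean and standard deviation of a frequency table (divisor N)."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, math.sqrt(variance)


k = 25  # assumed coding constant
coded = [f / k for f in families]            # 1, 6, 8, 3, 2
percentages = [100 * f / sum(families) for f in families]

original = mean_and_sd(incomes, families)
reduced = mean_and_sd(incomes, coded)
as_percent = mean_and_sd(incomes, percentages)
print(original, reduced, as_percent)
```

All three versions of the table give the same mean and standard deviation, because a common factor in the frequencies cancels between numerator and denominator.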

¹ Dividing the frequencies of a distribution by a constant.

5. The Coefficient of Variation. — The average or standard
deviations of two frequency distributions are not directly
comparable, because they depend upon the size of the mean or
median in each case, and upon the particular unit used. For
example, the weights of a herd of elephants may vary on the
average by 280 lb., while the weights of a litter of mice may differ
by 0.1 oz. Yet the mice may show a greater variation than the
elephants relative to their mean weights. Average and standard
deviations may, therefore, be made comparable by expressing
them as percentages of their means or medians. This percentage
is called the coefficient of variation in terms of the average or
standard deviation, and is written

$$V = \frac{100\,\text{A.D.}}{M}, \qquad (30)$$

or

$$V = \frac{100\,\text{A.D.}}{Md}, \qquad (31)^*$$

and

$$V = \frac{100\sigma}{M}. \qquad (32)$$

It is possible to use the coefficient of variation, V, as a measure
of the representativeness of an average. It may be said, arbitrarily,
that when V is above 50 per cent, it is usually advisable
to abandon the use of an average as a single value intended to give
an idea of the central tendency of a series. The V calculated for
the mean of Table 34 above by formula (30) is

$$V = \frac{100(0.804)}{0.67} = 120 \text{ per cent}.$$


In this case, V is 70 points above 50 per cent; hence the mean is
obviously a poor device for representing the actual values in this
very skewed or J-shaped distribution. If we apply formula (30)
to the mean of Table 40, below, which is merely a rearrangement
of the frequencies of Table 34 in more symmetrical form for
purposes of illustration, we find that V = 100(0.467)/1.97 = 24
per cent, indicating that the mean represents the values in this
table very well. This result would be expected from an inspection
of the distribution, which appears to be fairly symmetrical in form,
with the largest frequency in the center.

Table 40. — Previous Arrests Recorded for 100 Murderers
(Frequencies of Table 34 rearranged)

    X        f
    0        2
    1       20
    2       60
    3       15
    4        3
  Total    100

M = 1.97,
A.D. = 0.467.

* Only one of these formulas should be used in the same comparison.

In using formulas (30), (31), and (32), it will be seen that if
two distributions have equal average or standard deviations, but
unequal means or medians, the one with the larger average will
have the smaller coefficient of variation, V. This is as it should
be, provided that the means or medians used in finding the V's
contain no element that spuriously raises or lowers the values
from which the averages are calculated.

Suppose that the question is asked, Does Table 34 or Table 40
show a greater amount of variability from the mean? In the
case of Table 34 it has been seen that V = 120 per cent, and for
Table 40 it was found that V = 24. The V's, therefore, show
that Table 34 is 120/24 = 5 times as variable as Table 40, whereas
the average deviations would indicate that the former distribution
was less than twice as variable as the latter.
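The comparison of Tables 34 and 40 by formula (30) can be reproduced with the following sketch. The helper names are illustrative and do not come from the text.

```python
def mean_and_ad(xs, fs):
    """Mean and average deviation (from the mean) of a frequency table."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    ad = sum(f * abs(x - mean) for x, f in zip(xs, fs)) / n
    return mean, ad


def coefficient_of_variation(deviation, average):
    """Formula (30): the deviation as a percentage of the average."""
    return 100 * deviation / average


xs = [0, 1, 2, 3, 4]
m34, ad34 = mean_and_ad(xs, [60, 20, 15, 3, 2])   # Table 34
m40, ad40 = mean_and_ad(xs, [2, 20, 60, 15, 3])   # Table 40

v34 = coefficient_of_variation(ad34, m34)
v40 = coefficient_of_variation(ad40, m40)
print(round(v34), round(v40), round(v34 / v40, 1), round(ad34 / ad40, 2))
```

The ratio of the two V's is about 5, while the ratio of the two average deviations is well under 2, which is the contrast drawn in the paragraph above.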

6. Partition Values. — To show the scale values below which
any desired proportion of the frequencies in a distribution fall, a
set of partition values known as quartiles, deciles, etc., or more
inclusively as percentiles, has been devised. These measures
all employ the principle of the median, and apply primarily to
grouped data. Thus, while the median is that scale value below
which half of the values fall, the first quartile, Q₁, is the scale
value below which lie one-fourth of the values; the third quartile,
Q₃, is the scale value below which lie three-fourths of the values;
the ninth decile, d₉, is the scale value below which lie 90 per cent
of the values; the 65th percentile is the scale value below which
lie 65 per cent of the values; and so on. It is, therefore, seen
that each of these measures is merely a particular percentile
value, the median corresponding to the 50th percentile, the first
quartile to the 25th percentile, the third quartile to the 75th
percentile. The general method of finding any value is the same.

Because of logical difficulties, it is seldom that any partition
value except the median is found for ungrouped data.¹

¹ If the attempt must be made, however, it is generally best to accept
rough approximations, rather than insist on exact but imaginary
interpolations. For example, if we are required to furnish the third quartile, Q₃,
for the array of 12 ages (3, 5, 6, 9, 11, 16, 20, 21, 24, 25, 26, and 30 years)
we may find the position 12 × 0.75 = 9, and say that 100(9/12) = 75 per cent
of the ages are less than the age of 25 years that occupies 9 + 1 = 10th place
in the array. This statement is correct in the present case; but it is not
correct to say, further, that 100 − 75 = 25 per cent of the ages are greater
than 25 years. If the age 24 years in the array were replaced by a second
age 25 years, then the age 25 years would no longer be greater than 75 per
cent of the ages, but it would still probably be the most appropriate age to
offer as an approximate value for Q₃.

When the position, Np, found by multiplying the total number of items,
N, by the given percentage value, p, is not a whole number, the matter is
more complicated. Thus, if we drop the age 30 years from the top of the
above array, we have Np = 11 × .75 = 8.25. There is no 8.25th position
in this array, so we have to choose between positions number 8 and 9, or
else interpolate between them. If we take position 8 as the nearest integer,
and add one to it, as we did above, we get position 9. The age corresponding
to this position is 24 years, and we see that eight ages, or 100(8/11) = 72.7
per cent of the ages, are less than this age. Since 72.7 per cent is rather
close to 75 per cent, the age 24 years seems to be the simplest approximate
value to assign to Q₃.

Only when no actual position in an array gives a reasonably close
approximation to the meaning of a required percentile is it usually worth while to
interpolate between two positions. If our array above consisted of only the
first 10 ages, to find Q₃ we would have pN = 0.75(10) = 7.5. The age in
the eighth position is greater than 100(7/10) = 70 per cent of all the ages,
whereas that in the ninth position is greater than 100(8/10) = 80 per cent
of the ages. Here we might prefer to take the interpolated position,
(7 + 8)/2 = 7.5, so that, assuming continuous or grouped data, the theoretical
age corresponding to it would be greater than 100(7.5/10) = 75 per cent
of the ages in the array. This theoretical age, or value of Q₃, must be
halfway between age 20 in seventh position and age 21 in eighth position, or
(20 + 21)/2 = 20.5 years.

Notice that, in ungrouped data, the empirical formula, Np + 1, used
for locating the approximate integral position of such a partition value as
Q₃, is replaced by the formula p(N + 1) for determining the median position.

For grouped data, it will be recalled that the median is located
by dividing the total frequency, N, by 2, counting up the column
of accumulated frequencies of the table until the lower limit of
the class interval is reached which contains the median value,
and then interpolating within this interval to determine the
median value. When finding any percentile value other than
the median, we need only change the coefficient of the total
frequency, N. For example, in the case of Q₁, we use N/4; for
Q₃, 3N/4; for d₉, 0.9N; for the 65th percentile, 0.65N; and so on.
The general formula, using P to represent any percentile, median,
decile, or quartile value on the X scale, is

$$P = L + \frac{pN - F}{f}\, i, \qquad (33)$$

where p is the percentile rank or point of division on the frequency
scale expressed as a decimal fraction (e.g., p = 0.75), L is the lower
limit of the interval containing the pth value, N is the total
frequency of the table, F is the sum of the frequencies falling
below (i.e., in class intervals with limits smaller than) L, f is
the number of frequencies in the interval containing the pth
value, and i is the size of interval containing the pth value.

Let us find the values of Q₁, Q₃, d₇, and P₃₃ (33rd percentile)
in Table 41.


Table 41. — Distribution of the Estimated Income among Unmarried
Women of the United States in 1910*

  Income (X)        Women (f)     Accumulated (F)     Percentage accumulated (Y)
$  100-  199            10               10                     0.55
   200-  299            70               80                     4.42
   300-  399           560              640                    35.36
   400-  499           530            1,170                    64.64
   500-  599           280            1,450                    80.11
   600-  699           150            1,600                    88.40
   700-  799           110            1,710                    94.48
   800-  899            37            1,747                    96.52
   900-  999            22            1,769                    97.73
 1,000-1,099            16            1,785                    98.62
 1,100-1,199            12            1,797                    99.28
 1,200-1,299             8            1,805                    99.72
 1,300-1,399             5            1,810                   100.00
   Total             1,810

* From W. I. King, Wealth and Income of the People of the United States, p. 224, 1915.


To find Q₁, we have

pN = .25(1,810) = 452.5.

Counting up (i.e., in the direction of increasing values on the X
scale) the accumulated frequency column of the table, we see that
452.5 lies in the class interval 300-399. Therefore,

L = 300
F = 80
f = 560
i = 100.

Substituting in equation (33),

$$Q_1 = 300 + \frac{452.5 - 80}{560} \cdot 100,$$

or

$$Q_1 = 366.5.$$

That is, one-fourth of the women earned less than $366.50 a year.
Similarly,

$$Q_3 = 500 + \frac{1{,}357.5 - 1{,}170}{280} \cdot 100,$$

or

$$Q_3 = 567,$$

$$d_7 = 500 + \frac{1{,}267 - 1{,}170}{280} \cdot 100,$$

or

$$d_7 = 534.6,$$

$$P_{33} = 300 + \frac{597.3 - 80}{560} \cdot 100,$$

or

$$P_{33} = 392.4.$$


From these results we notice that three-fourths of the working
women made less than $567 annually, 70 per cent of them made
below $534.60, and one-third made under $392.40. Of course,
there is no point in calculating all of these values except for
illustrative purposes. We are usually interested in such fractions
as one-third, one-half, or three-fourths.
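Formula (33) lends itself to a short routine that first locates the interval containing the pth value and then interpolates within it. The sketch below is one possible arrangement; the function name is hypothetical, and two accumulated frequencies of Table 41 that are hard to read in the printed table (1,600 and 1,805) are assumed from the percentage column.

```python
# Table 41: lower class limits and accumulated frequencies.
# Assumption: 1,600 and 1,805 are inferred from the percentage column
# (88.40 and 99.72 per cent of 1,810).
lower_limits = list(range(100, 1400, 100))
cumulative = [10, 80, 640, 1170, 1450, 1600, 1710,
              1747, 1769, 1785, 1797, 1805, 1810]


def grouped_percentile(p, lower_limits, cumulative, width):
    """Formula (33): P = L + (pN - F) / f * i, with p given as a fraction."""
    n = cumulative[-1]
    target = p * n
    below = 0  # F, the frequencies accumulated below the current interval
    for lower, cum in zip(lower_limits, cumulative):
        if target <= cum:
            f = cum - below
            return lower + (target - below) / f * width
        below = cum
    raise ValueError("p must lie between 0 and 1")


q1 = grouped_percentile(0.25, lower_limits, cumulative, 100)
q3 = grouped_percentile(0.75, lower_limits, cumulative, 100)
d7 = grouped_percentile(0.70, lower_limits, cumulative, 100)
p33 = grouped_percentile(0.33, lower_limits, cumulative, 100)
print(round(q1, 1), round(q3, 1), round(d7, 1), round(p33, 1))
```

The four results agree with the hand calculations above to within rounding. Only the first five intervals enter into them, so the two assumed entries affect nothing here except the total N.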
An investigator often requires, not the value below which a
certain percentage of the frequencies fall, but the reverse of this,
namely, the percentage of the cases that falls below a certain
value, that is, the percentile rank of the value. Referring back
to the ungrouped array of 11 ages used above, viz., 3, 5, 6, 9, 11,
16, 20, 21, 24, 25, and 26 years, we may require the percentile rank
of the person aged 21. Since, by definition, this is equivalent to
asking what percentage of the persons in the array are less than
21 years of age, we note that there are 7 persons out of 11 who
are younger than 21 years, and compute 7/11 = 0.636, or 63.6 per
cent. We then say that the percentile rank of the person aged
21 is approximately 64.

Turning to grouped data, suppose we ask what proportion of
the unmarried women of Table 41 earned less than some minimum
living wage, say $550 a year. Our problem now is, knowing a
value on the X scale, to find the percentage of values on the Y
scale that fall below it. In the present case, it is evident that
1,170 women earned less than $500, and that 280 earned between
$500 and $599. We have [(550 − 500)/100](280) = 140, as the
number of women earning between $500 and $550. Therefore
1,170 + 140 = 1,310 is the number of women who made less
than $550. Expressed as a percentage of the total number of
women workers, we find that 100(1,310/1,810) = 72 per cent of the
women failed to earn as much as the minimum amount. A
formula for this calculation is

$$p = \left[F + \frac{f(P - L)}{i}\right]\frac{100}{N}, \qquad (34)$$


where p is the percentile rank sought, P is the given X scale value,
F is the accumulated frequencies in the class intervals with
limits smaller than those of the interval including P, f is the
frequency of the interval including P, L is the lower limit of this
same interval, i is its width, and N is the total frequency of the
table. Thus, substituting the values of the preceding problem
in formula (34), we get

$$p = \left[1{,}170 + \frac{280(550 - 500)}{100}\right]\frac{100}{1{,}810},$$

or p = 72 per cent, as before.
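The reverse calculation of formula (34) can be sketched in the same way. As before, the function name is hypothetical, and the accumulated frequencies 1,600 and 1,805 of Table 41 are assumed from the percentage column.

```python
# Table 41: lower class limits and accumulated frequencies
# (1,600 and 1,805 assumed from the percentage column).
lower_limits = list(range(100, 1400, 100))
cumulative = [10, 80, 640, 1170, 1450, 1600, 1710,
              1747, 1769, 1785, 1797, 1805, 1810]


def grouped_percentile_rank(value, lower_limits, cumulative, width):
    """Formula (34): p = [F + f(P - L) / i] * 100 / N."""
    n = cumulative[-1]
    below = 0  # F, the frequencies accumulated below the current interval
    for lower, cum in zip(lower_limits, cumulative):
        if lower <= value < lower + width:
            f = cum - below
            return (below + f * (value - lower) / width) * 100 / n
        below = cum
    raise ValueError("value lies outside the table")


rank_550 = grouped_percentile_rank(550, lower_limits, cumulative, 100)
print(round(rank_550))
```

Formulas (33) and (34) are inverses of one another: feeding the rank found here back into formula (33) as a fraction returns the value $550.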

An X scale value corresponding to a given accumulated frequency,
or a percentage frequency corresponding to a given X
scale value, may also readily be found by means of a cumulative
curve, which was described in Fig. 11, Chap. VI. The student is
asked to use this device to check the arithmetical results just
obtained from Table 41 above, preferably plotting the curve from
the last column of that table.

A measure known as the quartile deviation is sometimes used.
The formula is

$$Q = \frac{Q_3 - Q_1}{2}. \qquad (35)$$

Thus, for Table 41,

$$Q = \frac{567 - 366.5}{2} = 100.25.$$

The quartile deviation is employed only when the median is the
preferred average.

All these measures of dispersion (quartiles, deciles, percentiles,
quartile deviation) are so-called position values, and have the
same advantages and disadvantages as the median, previously
discussed. In particular, they are insensitive to extreme
values, and cannot be treated algebraically. They are especially
useful in analyzing a skewed frequency distribution, since they
maintain a definite relationship to the distribution, regardless of
its shape.

[Fig. 37. The distance Q₃ − Q₁ includes half of the cases. Points marked on the X scale: Q₁, Md, Q₃.]

7. Comparable Measures or Scores. — When two frequency
distributions are of about the same shape, e.g., both about
symmetrical, both slightly skewed in the same direction, both
J-shaped, etc., distances on their scales are usually compared
in units of their respective standard deviations. Thus, if we
have the distributions of many scores on two independent tests
of a given trait, for each test the deviations of the scores from
the true¹ mean are divided by the true standard deviation,
to get the desired standard scores. Given, for Test I, true
mean = 70, true σ = 10; and for Test II, true mean = 62,
true σ = 12. If subject A scored 80 on Test I and 60 on Test II,
his standard score on Test I is (80 − 70)/10 = 1, and on Test II is
(60 − 62)/12 = −.17; and his combined score on the two tests is
1 + (−.17) = 0.83. If subject B scored 75 on Test I and 65
on Test II, his corresponding standard scores are (75 − 70)/10 = 0.50
on Test I, and (65 − 62)/12 = 0.25 on Test II; and his combined
score is 0.75.

¹ By true is meant a statistic derived from many applications of a test,
rather than from a single application, to the same universe or type of subjects.
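The standard scores of subjects A and B can be checked with a few lines of Python. The dictionary layout and the function name are choices made for this sketch only.

```python
def standard_score(score, mean, sd):
    """Deviation from the mean expressed in standard deviation units."""
    return (score - mean) / sd


# (true mean, true standard deviation) for each test, as given above.
norms = {"I": (70, 10), "II": (62, 12)}
scores = {"A": {"I": 80, "II": 60}, "B": {"I": 75, "II": 65}}

combined = {
    subject: sum(standard_score(s, *norms[test]) for test, s in results.items())
    for subject, results in scores.items()
}
print({subject: round(total, 2) for subject, total in combined.items()})
```

On this scale subject A stands slightly ahead of subject B, although B had the higher raw score on Test II.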


Where two distributions differ markedly in form, e.g., one being
about symmetrical and the other J-shaped, or one very peaked
and the other flat, the standard deviations do not provide
consistent units for reducing their scale distances to more comparable
terms, because the proportion of frequencies included between
the mean and one standard deviation on each side of it changes
with the form of the distribution. Theoretically, perhaps the
best procedure under these circumstances is to normalize both
distributions, but the method is too complex to introduce here.¹
A cruder but much simpler method uses the Q's instead of the σ's
as common denominators. Although Q also has disadvantages,
it is one-half of the range Q₃ − Q₁, within which always falls
the middle half of the frequencies; and in that sense its
interpretation is independent of the shape of the distribution (see Fig. 37).

Suppose now that the distributions of many scores in Tests I
and II above are quite different, being J-shaped to the left
in Test I and skewed to the right in Test II. For Test I the
true median score is 74, and the true Q value is 6; for Test II,
the median score is 59 and Q is 8. We divide the deviations
of the two subjects' scores from the medians by the respective
Q values, and get

$$\frac{80 - 74}{6} + \frac{60 - 59}{8} = 1.125$$

as the combined score of subject A, and

$$\frac{75 - 74}{6} + \frac{65 - 59}{8} = 0.917$$

as the combined score of subject B. These may be called the Q scores.
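The Q scores follow the same pattern as standard scores, with the median in place of the mean and Q in place of σ. A brief sketch, with names chosen here for illustration:

```python
def q_score(score, median, q):
    """Deviation from the median expressed in quartile deviation units."""
    return (score - median) / q


# (true median, true Q) for each test, as given above.
norms = {"I": (74, 6), "II": (59, 8)}
scores = {"A": {"I": 80, "II": 60}, "B": {"I": 75, "II": 65}}

combined_q = {
    subject: sum(q_score(s, *norms[test]) for test, s in results.items())
    for subject, results in scores.items()
}
print({subject: round(total, 3) for subject, total in combined_q.items()})
```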

Instead of the standard scores or Q scores described above, the
method of equivalent percentile scores may be used in the effort to
make two independent scales comparable. For each scale, every
percentile or, say, every fifth percentile is found, and these values
are arranged in two parallel series, where corresponding pairs of
values are regarded as equivalent. Thus, in Table 42 below, the
values X₁ = 13 and X₂ = 0.1 are equivalent on the two scales.
The percentile values are found arithmetically from the two
given frequency distributions of scale values by formula (33)
above, or graphically from the ogive curve as illustrated in Fig.
11, Chap. VI. Suppose that we wish to compare the score of
subject 114 on Test X₁, 85, with the score of subject 17 on Test
X₂, 2.6. From Table 42 we see that a score of 2.6 on scale X₂
is equivalent to a score of 93 on scale X₁. Hence the two
comparable scores are 85 and 93. If either or both of the scores
of subjects 114 and 17 did not appear in Table 42, we would
find the percentile rank of say the second of them by formula
(34) above, and then, using this in formula (33), find the
corresponding value on the X₁ scale. This equivalent X₁ value
would then be compared with the X₁ score of the other subject.

¹ Paul Horst, Obtaining Comparable Scores from Distributions of
Dissimilar Shape, Journal of the American Statistical Association, Vol. 26, pp.
455-460, 1931.

Table 42. — Two Series of Equivalent Percentile Scale Values:
Attitude toward War

    n       Scale, X₁ (Pₙ)*      Scale, X₂ (Pₙ)
    5            13                  0.1
   10            23                  0.4
   15            32                  0.5
   20            41                  0.6
   25            49                  0.65
   30            56                  0.8
   35            63                  1.0
   40            69                  1.2
   45            75                  1.4
   50            80                  1.6
   55            85                  1.9
   60            89                  2.2
   65            93                  2.6
   70            95                  2.9
   75            97                  3.2
   80            97.5                3.6
   85            98                  3.9
   90            98.5                4.2
   95            99                  4.6
  100           100                  5.0

* nth percentile scale value.

The above three methods are not applicable to ungrouped or
scanty data.

When the data are inadequate, or when for other reasons we
have more confidence in the ability of two scales to arrange items
in rank order than to measure distances between them, simple
percentile ranks may be used for purposes of comparison. Given
the scores on a test, the percentile rank is found for each score.
For example, if 62 per cent of the scores made on a test are
less than the score 80, the percentile rank of the latter is 62.
For ungrouped data, the percentile ranks are found by the
informal method outlined on page 132; for grouped data, the
percentile ranks are obtained arithmetically from formula (34)
above, or graphically from an ogive. The weakness of percentile
ranks is, of course, that they do not reflect the distances between
the scores on any scale. Thus, the score 70 may have a percentile
rank of 50, the score 77 a percentile rank of 60, and the
score 85 a percentile rank of 90, so that the successive scores
stand in the ratio of 1:1.1, whereas the corresponding successive
percentile ranks bear the ratios 1:1.2 and 1:1.5, respectively.
For this reason, the difference between percentile ranks should
not be interpreted as proportional to the distance between the
corresponding scale values.
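
As a rough illustration of the definition just given, the following Python sketch takes the percentile rank of a score to be the percentage of scores falling below it. The function name `percentile_rank` and the list of scores are assumptions invented for the example; the informal method of page 132 and formula (34) may treat tied or grouped scores differently.

```python
def percentile_rank(scores, value):
    """Percentage of scores strictly less than `value` (illustrative definition)."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)

# Hypothetical test scores, invented for this example only.
scores = [55, 61, 64, 70, 72, 77, 80, 85, 90, 96]

for s in (70, 77, 85):
    print(s, percentile_rank(scores, s))
```

Equal steps in percentile rank need not correspond to equal steps in score, which is the weakness noted in the text.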

As a matter of fact, there is usually no feasible method of 
treating scores obtained from the use of very different kinds of 
scales that makes them strictly comparable. 

Exercises 

1. Compare the average deviation and the standard deviation of the
series below. Find the standard deviation by formulas (24) and (28)
as a check.

Number of Dependents in 25 Families on Relief

Family no.    Dependents      Family no.    Dependents
     1            3               14            5
     2            5               15            3
     3            4               16            3
     4            1               17            2
     5            6               18            4
     6            8               19            1
     7            2               20            3
     8            3               21            4
     9            3               22            3
    10            2               23            6
    11            4               24            2
    12            1               25            3
    13            2

2. Compare the average deviation and the standard deviation of the
following frequency distribution, using for the standard deviation
formula (26) with the Charlier check.

Semester Hours of Mathematics Taken by 67 Students in a Class of
Elementary Social Statistics

Semester Hours      Students
  43.5-46.4             1
  40.5-43.4             0
  37.5-40.4             0
  34.5-37.4             0
  31.5-34.4             0
  28.5-31.4             2
  25.5-28.4             2
  22.5-25.4             5
  19.5-22.4             4
  16.5-19.4             8
  13.5-16.4            13
  10.5-13.4            26
   7.5-10.4             4
   4.5- 7.4             1
  Total                66


3. Use the coefficient of variation, V, to measure the representativeness
of the mean of the distribution in Exercise 2, above.

4. Below are two random samples of family incomes in a certain city,
one taken in 1928, the other in 1932. Did the depression reduce or
increase the spread in income between families?

                    Number of families
Income               1928        1932
Under $500              5          76
500-999                15         123
1,000-1,499           115         155
1,500-1,999           190          91
2,000-2,499            82          70
2,500-2,999            63          52
3,000-3,499            27          17
3,500-3,999            19          12
4,000-4,499            10           7
4,500-4,999             6           3
5,000-5,499             3           1
Total                 535         607

5. Using the standard deviations found for the 1928 and 1932 series
in Exercise 4, compute the standard deviation for the two series
combined.

6. The table below shows the number of children who required the
specified numbers of hours of social contact before they were “accepted”
in a certain play group. (a) What percentage of the children took less
than 4 hours? (b) What percentage of the children took more than
10 hours? (c) How many hours did three-fourths of the children
require less than? (d) How many hours did three-fourths of the children
require more than?

Hours      Children
18-19          1
16-17          3
14-15          2
12-13          6
10-11         10
 8- 9          9
 6- 7          8
 4- 5          6
 2- 3          3
 0- 1          2
Total         50


7. Given two independent scales, X and Y, for the measurement of
“cooperation” between members of a random sample of urban families.
Family A has a score of +1.2 on scale X; family B has a score of 86 on
scale Y. Reduce these scores to as nearly comparable terms as you can.

   Scale X         Families        Scale Y      Families
-2.5 to -2.9           4             0- 9           21
-2.0 to -2.4          12            10-19           68
-1.5 to -1.9          22            20-29          109
-1.0 to -1.4          45            30-39          140
-0.5 to -0.9          71            40-49          131
-0.0 to -0.4          89            50-59           91
+0.0 to +0.4         116            60-69           74
+0.5 to +0.9         132            70-79           56
+1.0 to +1.4         151            80-89           28
+1.5 to +1.9          93            90-99           13
+2.0 to +2.4          60            Total          731
+2.5 to +2.9          17
Total                812
8. Suppose that the frequencies for the Y scale in Exercise 7 are
reversed end for end of the scale, while those for the X scale remain as
they are. Convert these scores to a more comparable basis.


CHAPTER IX

COMBINATION, PROBABILITY, AND THE NORMAL
DISTRIBUTION


1. Permutations and Combinations.¹ — It is often desirable in
sociological investigations to know the total number of ways in
which a certain event can occur. For example, in a study of
intercity migration among five cities, how many paths can the
migration take? Or, among 10 girls in a boarding school, three
two-girl friendships are found. How many such friendships are
possible in this group? The same kind of problem arises in
connection with the binomial formula, discussed in Sec. 3 below.

To answer the question about the paths of migration, we
notice that since a migrant may go from any of the five cities
to any of the four remaining cities, the number of paths must be
5 × 4 = 20. Not only do we count each pair of cities, but also
the two orders or arrangements in which the members of a pair
may be taken, as “from a to b,” and “from b to a.” A pair of
cities in a given order, e.g., “from a to b,” is called a permutation,
and the general formula provided by algebra for finding the
number of permutations of n things taken r at a time is


$${}_{n}P_{r} = \frac{n!}{(n - r)!}.$$    (36)²

For the problem above, we substitute in the formula, and get

$${}_{5}P_{2} = \frac{5!}{(5 - 2)!} = \frac{5 \times 4 \times 3!}{3!} = 5 \times 4 = 20,$$

as before.
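
The count of migration paths can be checked mechanically. The sketch below is only an illustration: the city labels and the helper name `n_permutations` are assumptions, and the calculation is done twice, once by formula (36) and once by listing every ordered pair of cities.

```python
from itertools import permutations
from math import factorial, perm

def n_permutations(n, r):
    """Formula (36): n! / (n - r)!"""
    return factorial(n) // factorial(n - r)

cities = ["a", "b", "c", "d", "e"]        # five hypothetical cities
paths = list(permutations(cities, 2))     # ordered pairs (origin, destination)

print(n_permutations(5, 2))   # by formula (36)
print(len(paths))             # by direct enumeration
print(perm(5, 2))             # standard-library equivalent (Python 3.8+)
```

Both “from a to b” and “from b to a” appear in `paths`, since order matters in a permutation.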

Formula (36) is based on Theorem 1.

Theorem 1. If an event A can occur in m ways, and thereafter
an event B can occur in n ways, A and B can occur together in the
order named in mn ways.

¹ For a fuller treatment of this subject, see any text in college algebra,
e.g., H. B. Fine, College Algebra, Chap. XXV, Ginn and Company, Boston,
1904.

² n! is called “n factorial,” and means the product of all consecutive
numbers from 1 through n. For example, 4! = 4 × 3 × 2 × 1 = 24.

A first approach to the problem of the boarding school friendships
mentioned above can also be made by means of formula
(36). The number of arrangements, or permutations, of 10 girls
taken two at a time is

$${}_{10}P_{2} = \frac{10!}{8!} = 10 \times 9 = 90.$$

Here, however, there is no interest in the order of the girls in a
two-girl friendship. When this is the case, i.e., when a group
of things is taken without regard for the arrangement of the
members, the group is called a combination. Evidently, each
pair of girls can be arranged in two orders or permutations, so
that the 90 permutations found above reduce to 90/2 = 45 combinations.
The formula for combinations is, therefore,

$${}_{n}C_{r} = \frac{{}_{n}P_{r}}{r!} = \frac{n!}{r!\,(n - r)!}.$$    (37)

Using it, we get again

$${}_{10}C_{2} = \frac{10!}{2!\,8!} = \frac{10 \times 9 \times 8!}{2!\,8!} = \frac{10 \times 9}{2 \times 1} = \frac{90}{2} = 45.$$
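
The same check can be made for combinations. In this sketch the girls are simply labeled 1 through 10 and the helper name `n_combinations` is an assumption; the unordered pairs are listed directly and compared with formula (37).

```python
from itertools import combinations
from math import comb, factorial

def n_combinations(n, r):
    """Formula (37): n! / (r! (n - r)!)"""
    return factorial(n) // (factorial(r) * factorial(n - r))

girls = range(1, 11)                          # ten girls, labeled 1..10
friendships = list(combinations(girls, 2))    # unordered pairs of girls

print(n_combinations(10, 2), len(friendships), comb(10, 2))
```

Each pair appears once only, so the 90 ordered arrangements collapse to half that number.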


Although formulas (36) and (37) apply to a large number of 
problems, some problems occur that are best approached inde- 
pendently. As an easy example, suppose we ask. What is the total 
number of possible relationships that can exist between two 
persons, X and Y, in terms of attraction, indifference, and 
repulsion? To each of the three attitudes of X, Y may respond 
with three attitudes, so that, by Theorem 1 above, we have 
3 × 3 = 9 relationships. These relationships are (1) mutual
attraction between X and Y; (2) X is attracted by Y, but Y is
indifferent to X; (3) X is attracted by Y, but Y is repulsed by
X; (4) mutual indifference between X and Y; (5) X is indifferent
to Y, but Y is attracted by X; (6) X is indifferent to Y, but Y is
repulsed by X; (7) mutual repulsion between X and Y; (8) X is
repulsed by Y, but Y is indifferent to X; and (9) X is repulsed by
Y, but Y is attracted by X.
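
The nine relationships are an instance of Theorem 1 and may be listed by pairing every attitude of X with every attitude of Y. The following is a minimal sketch; the variable names are assumptions made for the illustration.

```python
from itertools import product

attitudes = ["attraction", "indifference", "repulsion"]

# Each relationship pairs X's attitude toward Y with Y's attitude toward X.
relationships = list(product(attitudes, repeat=2))

for x_to_y, y_to_x in relationships:
    print(f"X toward Y: {x_to_y:12s}  Y toward X: {y_to_x}")

print(len(relationships))
```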
2. Probability. — Chance, often called “luck,” and the tricks
it plays are known to everyone. In a hand at cards, one may
draw no ace; one, two, three, or even all four aces. Whether a
person is male or female, white or black, European or American, 
is, as far as he is concerned, purely an accident. The occupation 
one follows, the person one marries, the state of one's health, 
and so on, are also subject to a great amount of chance. Dis- 
covery and invention, even the trend in the development of a 
nation's culture in the sociological sense, depend in part on 
thousands of small forces of which we have no knowledge. 
If the birth rates in a city differ in 1939 and 1940, is it because 
fundamental conditions affecting fertility have changed, or is 
the variation due merely to accidental factors that will cancel 
out over several years? In one random sample of old people 
there may be more male than female survivors and in another 
sample exactly the reverse, regardless of the true proportion 
in the population. It is, therefore, not surprising that any 
careful attempt to investigate social life or culture is obliged
to reckon with this element of chance. Chance distorts the 
findings of research, and must be allowed for. 

One of the greatest practical contributions of mathematics 
has been its discovery, beneath apparent confusion, of a remark- 
able regularity in the occurrence of chance events. By mathe- 
matical means, we can estimate the amount of variation due to 
chance and predict the number of occurrences of any event whose 
probability is known, e.g., the annual deaths in a class of insurance
risks. On these mathematical laws of probability are
founded great business enterprises like insurance, as well as the 
basic techniques of a vast amount of scientific and industrial 
research. 

The exact mathematical definition of probability is this: If an
event can succeed in m ways and fail in m' ways, all equally
likely and mutually exclusive, and the event must either succeed
or fail, the probability of its succeeding is

$$p = \frac{m}{m + m'},$$    (38)

and that of its failing is

$$q = \frac{m'}{m + m'}.$$    (39)

That is,

$$p + q = \frac{m + m'}{m + m'} = \frac{1}{1} = 1.$$    (40)

In other words, since an event must either succeed or fail, the
probability of certainty is one in one, or unity.
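
Formulas (38) to (40) can be tried on a simple case. The sketch below uses the drawing of an ace from a 52-card deck (4 ways to succeed, 48 ways to fail) purely as an assumed example; the function names are likewise illustrative. Exact fractions are used so that p + q comes out as exactly 1.

```python
from fractions import Fraction

def success_probability(m, m_prime):
    """Formula (38): p = m / (m + m')."""
    return Fraction(m, m + m_prime)

def failure_probability(m, m_prime):
    """Formula (39): q = m' / (m + m')."""
    return Fraction(m_prime, m + m_prime)

# Drawing an ace from a 52-card deck: 4 ways to succeed, 48 ways to fail.
p = success_probability(4, 48)
q = failure_probability(4, 48)
print(p, q, p + q)
```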

The proportion of ways in which an event can succeed may be
determined for practical purposes by one of two methods, or by
both. In the case of a penny, we decide that the probability of
throwing a head is 1/2, by reasoning that the penny has only two
sides and is equally balanced so that one of them is as likely to
turn up as the other. This is an illustration of the theoretical or
a priori method. By the so-called empirical method, the chance
of death within a year of a white male, aged 30, engaged in a 
clerical occupation, married, and an medical risk, is found 
by simply counting the proportion of annual deaths occurring 
among a very large number of such individuals (say, 354 deaths 
among 85,707 persons, giving a probability of 0.00413). The
empirical method is sound if the probability tends to approach a
limit, as the estimate is based on an ever-increasing number of
cases under essentially the same conditions. In both methods,
of course, it is supposed that the conditions under which the
probability was obtained will hold approximately for all situations
to which the probability is applied. For example, if each
added count of deaths in a risk group like that described above
causes the average probability of death to approach nearer to
some figure 0.00400, then 0.00400 may be regarded as an approximation
of the true (expected) proportion that exists in the given
class as a whole (an infinite universe). But it would obviously
be wrong to apply this death rate to a class in which the age was
40 instead of 30 years!

Two basic theorems of probability are:

Theorem 2. Of two mutually exclusive¹ events, A and B,
if the event A has a probability of occurring, p, and the event B
has a probability of occurring, p', the probability that either A or
B will occur in one possible way is p + p'.

Theorem 3. If an event A has a probability of occurring, p, and
an event B has a probability of occurring with or after A in one
possible way, p', the probability that both A and B will so occur is pp'.

¹ Two events are mutually exclusive when in a single trial only one of
them can happen. In a hand at cards, drawing an ace and drawing a jack
are mutually exclusive events, but drawing an ace and drawing a diamond
are not, because both may appear on the same card. If the two events are
not mutually exclusive, the probability is p + p' − pp'.
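
The two theorems and the footnote can be verified by counting cards. The sketch below assumes a single card drawn from a standard 52-card deck, with every card equally likely; the helper `prob` and the event functions are names invented for the illustration. Rank and suit are independent in such a deck, which is why the product rule applies to “ace and diamond.”

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))          # 52 equally likely cards

def prob(event):
    """Probability of an event: favorable cards over all cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_ace(card):
    return card[0] == "A"

def is_jack(card):
    return card[0] == "J"

def is_diamond(card):
    return card[1] == "diamonds"

# Theorem 2: probabilities of mutually exclusive events add.
p_ace_or_jack = prob(lambda c: is_ace(c) or is_jack(c))
assert p_ace_or_jack == prob(is_ace) + prob(is_jack)

# Theorem 3 (independent events): probabilities multiply.
p_ace_and_diamond = prob(lambda c: is_ace(c) and is_diamond(c))
assert p_ace_and_diamond == prob(is_ace) * prob(is_diamond)

# Footnote: events that are not mutually exclusive, p + p' - pp'.
p, p_prime = prob(is_ace), prob(is_diamond)
p_ace_or_diamond = prob(lambda c: is_ace(c) or is_diamond(c))
assert p_ace_or_diamond == p + p_prime - p * p_prime

print(p_ace_or_jack, p_ace_and_diamond, p_ace_or_diamond)
```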

A first application of Theorem 3 may be made to a typical
problem. A community is inhabited by two groups of different
nationality and religious backgrounds, Swedish Lutherans and
German Catholics. Among the Lutherans in the age class 40 to
45 years are 40 females and 44 males; among the Catholics, 62
females and 58 males, all married to someone included in the
enumeration. The records show 18 mixed marriages, 11 between
Lutheran males and Catholic females, and 7 between Catholic
males and Lutheran females. How does this observation compare
with the number of mixed marriages that would be expected
if there were no prejudice for or against them in the community?
We set up the totals of Table 43. By the definition on page 145,
the probability of a marriage occurring in row (1) is $m_1/N = 62/102$,
and of a marriage occurring in column (1) is $n_1/N = 44/102$. By
Theorem 3, the probability of a marriage occurring in both row
(1) and column (1) is $(m_1/N)(n_1/N) = (62/102)(44/102)$; therefore the
expected number of Catholic women marrying Lutheran men
is $(m_1/N)(n_1/N)(N) = m_1 n_1/N = 62(44)/102 = 26.7$. This expected
frequency is entered in the proper cell in the table.
Similarly, the expected frequency in the cell common to row (2)
and column (2) is 40(58)/102 = 22.7. Thus the total number of
expected mixed marriages is 26.7 + 22.7 = 49.4, or approximately
49; whereas the observed number is 18, only 36 per cent
of the expected number. Evidently, there are obstacles in the
way of marriages between the Swedish Lutherans and the German
Catholics in this community.

Table 43. — Fourfold Table for Determining Probability of Mixed
Marriages

                                 Males
Females           Lutheran (1)    Catholic (2)    Total (3)
Catholic (1)          26.7            35.3        m1 = 62
Lutheran (2)          17.3            22.7        m2 = 40
Total (3)           n1 = 44         n2 = 58       N = 102
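
The expected frequencies can be reproduced from the marginal totals alone. The sketch below is an illustration under the stated hypothesis of no prejudice (independence of rows and columns); the dictionary layout and the function name `expected` are assumptions. Carried without rounding, the expected number of mixed marriages comes to about 49.5, which agrees with the text's 49.4 apart from the rounding of the two addends.

```python
# Marginal totals from the marriage example.
row_totals = {"Catholic females": 62, "Lutheran females": 40}
col_totals = {"Lutheran males": 44, "Catholic males": 58}
N = sum(row_totals.values())            # 102 marriages

def expected(row_total, col_total, n):
    """Expected cell frequency under independence: (row total)(column total)/N."""
    return row_total * col_total / n

table = {
    (r, c): expected(rt, ct, N)
    for r, rt in row_totals.items()
    for c, ct in col_totals.items()
}

mixed = (table[("Catholic females", "Lutheran males")]
         + table[("Lutheran females", "Catholic males")])
observed_mixed = 11 + 7

for cell, value in table.items():
    print(cell, round(value, 1))
print(round(mixed, 1), round(100 * observed_mixed / mixed))
```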
This conclusion may be more fully established by applying the
Chi-square ($\chi^2$) method to Table 44. This method is designed
to test the hypothesis that the differences between a set of
observed and expected frequencies may be due solely to chance.
To obtain $\chi^2$, we subtract each expected frequency ($f_t$) from
the corresponding observed frequency ($f_o$), divide the squared
difference by the expected frequency, and sum these ratios.
The calculations are shown in Table 44.

Table 44. — Chi-square ($\chi^2$) Test

                             Marriages
Females     Males       Observed   Theoretical   fo - ft   (fo - ft)^2   (fo - ft)^2 / ft
                          (fo)        (ft)*
Catholic    Lutheran       11         26.7        -15.7       246.5            9.23
Catholic    Catholic       51         35.3        +15.7       246.5            6.98
Lutheran    Lutheran       33         17.3        +15.7       246.5           14.25
Lutheran    Catholic        7         22.7        -15.7       246.5           10.86

                                                              chi-square =    41.32

* If any theoretical cell frequency is less than five, a correction is needed.
See Paul Rider, An Introduction to Modern Statistical Methods, pp. 112-113,
John Wiley & Sons, Inc., New York, 1939.
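
The arithmetic of Table 44 can be sketched as follows. The function names and the `ndigits` option are assumptions made for the illustration: with expected frequencies rounded to one decimal, as in hand calculation, the sum is about 41.32; carried without rounding it is about 41.57. The last function is an addition not found in the text. It applies only to a 2 × 2 table, which has one degree of freedom, and uses the fact that chi-square with one degree of freedom is the square of a standard normal deviate.

```python
from math import erfc, sqrt

# Observed frequencies keyed by (females, males), from Table 44.
observed = {
    ("Catholic", "Lutheran"): 11,
    ("Catholic", "Catholic"): 51,
    ("Lutheran", "Lutheran"): 33,
    ("Lutheran", "Catholic"): 7,
}

def chi_square(obs, ndigits=None):
    """Sum of (fo - ft)^2 / ft, with ft computed from the marginal totals.

    If ndigits is given, each expected frequency is rounded first,
    imitating hand calculation.
    """
    n = sum(obs.values())
    rows, cols = {}, {}
    for (female, male), f in obs.items():
        rows[female] = rows.get(female, 0) + f
        cols[male] = cols.get(male, 0) + f
    total = 0.0
    for (female, male), fo in obs.items():
        ft = rows[female] * cols[male] / n
        if ndigits is not None:
            ft = round(ft, ndigits)
        total += (fo - ft) ** 2 / ft
    return total

def chance_probability_1df(x2):
    """P(chi-square >= x2) for one degree of freedom only."""
    return erfc(sqrt(x2 / 2))

exact = chi_square(observed)
by_hand = chi_square(observed, ndigits=1)
print(round(exact, 2), round(by_hand, 2))
print(chance_probability_1df(exact))
```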


It was seen above that the expected frequencies used in Table
44 were calculated from the row and column totals of the observed
frequencies in Table 43. This means that the observed and
expected frequencies in the cells of Table 44 were to a certain
extent made to agree. Evidently, this forced agreement should
be allowed for in testing the amount of difference between the
two sets of frequencies. In any 2 × 2 table, like Table 43,
it is clear that if the row totals, the column totals, and one
observed cell frequency are given, the other three cell frequencies
are at once determined. Therefore, only one cell frequency is
free of the influence of the marginal totals, so that a 2 × 2 table
is said to have one “degree of freedom.”¹ If now the value of $\chi^2$
obtained is referred to a table of $\chi^2$, such as Appendix Table 2,
that takes account of degrees of freedom, the spurious resemblance
between the observed and expected frequencies to which
we objected above is corrected for.

Entering Appendix Table 2 with one degree of freedom, then,
we find that a $\chi^2$ as large as 6.635 could occur by chance once in
100 times, the theory involved here being similar to that described
in the latter part of Sec. 4, below. Our $\chi^2$ is 41.32, which is
much larger, and would occur by chance less often than once in
100 times. Since it is customary to reject chance as the explanation
of an event that can happen by chance no oftener than five
times in 100, we conclude that the frequency of mixed marriages
in the community cited is reduced by sociological and perhaps
economic forces.

The classic method of introducing the elementary notions of
probability is to use the illustration of coin tossing. The event
is the occurrence of a “head” or a “tail.” We may toss one
coin several times, several coins once, or several coins several
times, as we wish. It is evident that the events are mutually
exclusive, as specified in Theorem 2, above. We may also
assume that all the coins tossed during the experiment are
exactly alike in size, weight, shape, and balance, i.e., in respect
to every fixed or biased factor that affects the tendency of heads
or tails to fall uppermost when the coin is tossed. In this
way we meet the requirement that each event of a probability
set² shall be equally likely. Differently expressed, it is assumed
that the probability, p, of throwing a head is the same for every
penny at each throw, and that every penny at each throw is
independent of every other penny, i.e., there is no tendency for
one penny to show heads or tails because another does or does
not, as would happen if they were stuck together. Finally,
of the two events that can occur, one, heads, we call a success,
and the other, tails, we call a failure. Having specified these
conditions, our first question is, What is the probability of
throwing a head, or of getting a success, at any one toss of a
penny? In other words, what is the value of p?

Since in a single toss of one penny there is only one way in
which a success can occur and one way in which a failure can
occur, and we assume that the pennies are balanced so that these
two events are equally likely, we have, in the notation introduced
above, m = m' = 1. Hence, from formulas (38) and (39),
p = q, and substituting p for q or q for p in formula (40), we
find that p = q = 1/2 = 0.5.

¹ The degrees of freedom for any contingency table are (c − 1)(r − 1),
where c is the number of columns and r is the number of rows. See A. E.
Treloar, Elements of Statistical Reasoning, pp. 215 and 229, John Wiley &
Sons, Inc., New York, 1939. A contingency table is a table of frequencies
divided according to two or more principles of classification, such as the
table in Exercise 7 at the end of this chapter.

² A “probability set” is described by the denominator of formula (38)
or (39).

Suppose that we throw 10 pennies, and want to know the
probability of getting exactly eight of a kind, i.e., eight heads or
eight tails. If eight of 10 pennies show heads, then the other
two must show tails, or vice versa. We just saw that if we throw
one penny, the probability of getting a head in one throw is
p = .5. By Theorem 3, above, the probability of eight successes
occurring in one possible way is $p^8 = (.5)^8$, the probability of two
failures occurring in one possible way is $q^2 = (.5)^2$, and the
probability of these two events occurring together in one possible
way is $p^8 q^2 = (.5)^8(.5)^2$. But the eight heads may occur among
the 10 pennies in several possible ways, so that by Theorem 2 the
probability of occurrence in just one way should be summed as
many times as there are possible ways, or, more briefly, multiplied
by the number of possible ways. How many possible ways are
there? This is equivalent to asking, In how many ways may we
get eight heads from 10 pennies, or, how many possible combinations
are there of 10 (= n) things taken eight (= r) at a time?
To answer this, we already have formula (37) above, which for
our problem gives¹

$${}_{10}C_{8} = \frac{10!}{8!\,2!} = \frac{10(9)(8)(7)(6)(5)(4)(3)(2)(1)}{(8)(7)(6)(5)(4)(3)(2)(1)(2)(1)} = 45.$$

Hence the probability, P, of getting exactly eight heads in a single
throw of 10 pennies is

$$P = {}_{n}C_{r}\, p^{r} q^{n-r},$$    (41)

or

$$P = 45(.5)^8(.5)^2 = 45(.5)^{10}.$$

Using logarithms,² we find

$$\log (.5)^{10} = 10 \log .5 = 10(9.69897 - 10) = 96.98970 - 100.$$

The antilogarithm of this is .0009766. Hence

$$P = 45(.0009766) = .044.$$

That is, in 44 out of 1,000 trials we would expect by the laws
of chance to get exactly eight heads in a toss of 10 pennies.
Similarly, by Theorem 2, the probability of getting either eight
heads or eight tails is 2 × 0.044 = 0.088. This last is the probability
that answers our question. Any similar question can be
readily answered by substituting in formula (41), above.

¹ See Appendix Table 3. For extensive tables of factorials or their logarithms,
see T. C. Fry, Probability and Its Engineering Uses, pp. 427-438, D.
Van Nostrand Company, Inc., New York, 1928. A briefer table is given in
Mathematical Tables from Handbook of Chemistry and Physics, 5th ed., p.
180, Chemical Rubber Publishing Company, Cleveland.

² See Appendix Table 7 and accompanying Foreword.
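
Formula (41) is easily evaluated without logarithms. The following is a minimal sketch; the function name `binomial_probability` is an assumption, and q is taken as 1 − p in accordance with formula (40).

```python
from math import comb

def binomial_probability(n, r, p):
    """Formula (41): P = nCr * p^r * q^(n - r), with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

p_eight_heads = binomial_probability(10, 8, 0.5)
p_eight_of_a_kind = 2 * p_eight_heads     # eight heads or eight tails

print(round(p_eight_heads, 3), round(p_eight_of_a_kind, 3))
```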

3. The Binomial Distribution. — We often want to know the
probability of getting as many as or more than a specified number
of successes or failures. From what has been said, it will be
seen that the probability of getting no successes at all in a toss
of n pennies is $q^n$, of getting one success is ${}_{n}C_{1}\,p\,q^{n-1}$, of getting
two successes is ${}_{n}C_{2}\,p^2 q^{n-2}$, and so on, and, finally, the probability
of getting all successes is $p^n$. Since these combinations of events
exhaust the possibilities, some one of them is certain to occur
at any toss of n pennies. In other words, the probability of one
or another of them occurring is unity, or one. By Theorem 2,
we may therefore write the equation

$$q^n + {}_{n}C_{1}\,p\,q^{n-1} + {}_{n}C_{2}\,p^2 q^{n-2} + {}_{n}C_{3}\,p^3 q^{n-3} + \cdots + {}_{n}C_{r}\,p^r q^{n-r} + \cdots + p^n = 1.$$    (42)

But by formula (40), p + q = 1, and hence $(p + q)^n = 1$. It
therefore appears that by substitution

$$(q + p)^n = q^n + {}_{n}C_{1}\,p\,q^{n-1} + {}_{n}C_{2}\,p^2 q^{n-2} + \cdots + {}_{n}C_{r}\,p^r q^{n-r} + \cdots + p^n.$$    (43)

If p = q = 1/2, the formula simplifies to

$$(q + p)^n = \left(\tfrac{1}{2}\right)^n \left(1 + {}_{n}C_{1} + {}_{n}C_{2} + \cdots + {}_{n}C_{n-1} + 1\right).$$    (44)

This is the familiar binomial expansion of algebra, which is now
seen to be an expression of the operation of the laws of chance!¹

¹ In algebra, the binomial formula is usually written

$$(q + p)^n = q^n + n\,q^{n-1}p + \frac{n(n-1)}{1 \cdot 2}\,q^{n-2}p^2 + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\,q^{n-3}p^3 + \cdots + p^n,$$

and it is pointed out that the exponent of q decreases by 1, while the exponent
of p increases by 1, each term; and that the coefficient of any term, if
multiplied by the exponent of q and divided by the number of the term, gives
the coefficient of the next term.
To discover the probability of getting, say, eight or more heads
in a single toss of 10 pennies, therefore, we need only apply the
binomial. The probability of getting eight or more heads means,
specifically, the probability of getting eight, nine, or 10 heads;
and by Theorem 2, this is equal to the sum of the probabilities of
the three separate events. By formula (41), which is the general
term of the binomial, the probability of eight heads is ${}_{10}C_{8}\,p^8 q^2$,
of nine heads is ${}_{10}C_{9}\,p^9 q$, and of 10 heads is $p^{10}$. Summing these,

$$\begin{aligned}
P &= {}_{10}C_{8}\,p^8 q^2 + {}_{10}C_{9}\,p^9 q + p^{10} = 45(.5)^8(.5)^2 + 10(.5)^9(.5) + (.5)^{10} \\
  &= (.5)^{10}(45 + 10 + 1) = .0440 + .0098 + .0010 = 0.055.
\end{aligned}$$

Accordingly, in 1,000 throws of 10 pennies, we may expect to get
eight, nine, or 10 heads about 55 times. And the probability of
getting eight or more heads or eight or more tails is, of course,
2 × .055 = .11, or 11 times in 100 throws. Notice that this is
merely the most probable number and will vary from one set of
100 throws of 10 pennies each to another. But in a very large
number of throws the average proportion should come rather
close to 11 per 100 throws of 10 pennies each.
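
The summation of the last terms of the binomial can be sketched as below. The function name `at_least` is an assumption; it simply adds the general term of formula (41) from the specified number of successes up to n.

```python
from math import comb

def at_least(n, r_min, p):
    """Probability of r_min or more successes in n independent trials."""
    q = 1 - p
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(r_min, n + 1))

p_tail = at_least(10, 8, 0.5)
print(round(p_tail, 3), round(2 * p_tail, 2))
```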

Suppose, again, we throw 10 pennies 150 times. In how
many of these trials may we expect to get exactly eight of a kind?
Since we have found this probability to be 0.088, we may expect
this event in the proportion of about nine times in 100 trials,
in the long run. If N represents the number of trials, and S
the number of trials in which the specified event may be expected
to happen, the formula is approximately

$$S = PN.$$    (45)

Substituting P = .088 and N = 150 in this formula, we find
S = .088(150) = 13.2. That is, in 150 tosses of 10 pennies each,
about 13 is the most probable number of tosses that will show
exactly eight heads or eight tails.

Similarly, if it is wanted to know the frequency with which each
possible number of successes, from 0 to n, may be expected to
occur by chance in N trials of n events each, each term of the
binomial expansion in formula (42) or (43) is simply multiplied
by N:

$$q^n N + {}_{n}C_{1}\,p\,q^{n-1}N + {}_{n}C_{2}\,p^2 q^{n-2}N + \cdots + {}_{n}C_{r}\,p^r q^{n-r}N + \cdots + p^n N = N.$$    (46)
Thus, if we throw 10 pennies 1,000 times, we have

$$\begin{aligned}
1{,}000(.5 + .5)^{10} ={}& .00098(1{,}000) + .00977(1{,}000) + .04395(1{,}000) + .11719(1{,}000) \\
 &+ .20508(1{,}000) + .24609(1{,}000) + .20508(1{,}000) + .11719(1{,}000) \\
 &+ .04395(1{,}000) + .00977(1{,}000) + .00098(1{,}000),
\end{aligned}$$

or

$$.98 + 9.77 + 43.95 + 117.19 + 205.08 + 246.09 + 205.08 + 117.19 + 43.95 + 9.77 + .98 = 1{,}000.$$

This is really a binomial frequency distribution¹ and is so
arranged in Table 45. From this table, we see that out of 1,000
tosses of 10 pennies each, we would expect no heads in only
about one toss, one head in something like 10 tosses, two heads
in approximately 44 tosses, and so on.

Table 45. — Frequency Distribution of 1,000 Tosses of 10 Pennies

Number of        Number of
Heads (X)        Tosses (f)
    0                 .98
    1                9.77
    2               43.95
    3              117.19
    4              205.08
    5              246.09
    6              205.08
    7              117.19
    8               43.95
    9                9.77
   10                 .98
  Total          1,000.00
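
The frequencies of Table 45 can be generated from formula (46). The sketch below is an illustration only; the function name `expected_frequencies` is an assumption, and the values are printed to three decimals, so they will agree with the table to within rounding.

```python
from math import comb

def expected_frequencies(n, p, trials):
    """Terms of formula (46): N * nCr * p^r * q^(n - r) for r = 0..n."""
    q = 1 - p
    return [trials * comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

freqs = expected_frequencies(10, 0.5, 1000)
for heads, f in enumerate(freqs):
    print(f"{heads:2d}  {f:8.3f}")
print(f"Total {sum(freqs):.2f}")
```

Because the terms of the expansion sum to unity, the expected frequencies must total N, here 1,000.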


Let us now pass from the theoretical case of penny tossing to
some problem that might arise in social research. For example,
the proportion of males in the urban population of Wisconsin
in 1930 was $p_1 = 0.4974$; in the rural nonfarm population,
$p_2 = 0.5118$; and in the rural farm population, $p_3 = 0.5435$.
If we regard the three populations — urban, rural nonfarm, and
rural farm — as ranked in the order of urbanness, and if we subtract
the proportion of males in the less urban from that in the
more urban of each of the three possible pairings of these populations,
we get

$$\begin{aligned}
p_1 - p_2 &= .4974 - .5118 = -.0144, \\
p_1 - p_3 &= .4974 - .5435 = -.0461, \\
p_2 - p_3 &= .5118 - .5435 = -.0317.
\end{aligned}$$

¹ See Chap. V.


We notice that all three of the signs are negative. In the case of
another Middle Western state taken at random, the same result
was found. Does this mean that the proportion of males is really
greater in the more rural populations, or may the negative signs
in the two states be just a trick of chance? By formula (36), we
see that there are ${}_{3}P_{3} = 3!/(3 - 3)! = 6$ possible orders of relative
magnitude that $p_1$, $p_2$, and $p_3$ can take (e.g., $p_1 < p_3 < p_2$;
$p_2 < p_1 < p_3$; etc.), if we assume that they are never equal
(i.e., $p_1 \neq p_2 \neq p_3$). Since the order observed, $p_1 < p_2 < p_3$,
is only one of the six, the probability that it will occur in one
random trial (or state) is 1/6. Hence the probability of getting
only negative signs in both states is

$${}_{2}C_{2}\left(\tfrac{1}{6}\right)^2\left(\tfrac{5}{6}\right)^0 = \left(\tfrac{1}{6}\right)^2 = \tfrac{1}{36} = .028$$

by formula (41).
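
The reasoning may be checked by listing the six possible orders outright. The sketch below rests on the same assumption as the text, namely that under chance alone every order of magnitude is equally likely and the two states are independent; the variable names are invented for the illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

labels = ["p1", "p2", "p3"]
orders = list(permutations(labels))        # every ranking, smallest to largest
p_one_state = Fraction(1, len(orders))     # chance of one particular order

# Formula (41) with n = 2 states and r = 2 successes.
n, r = 2, 2
p_both = comb(n, r) * p_one_state**r * (1 - p_one_state)**(n - r)

print(len(orders), p_one_state, p_both, round(float(p_both), 3))
```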

Statisticians usually insist that the probability of a chance
occurrence be no greater than 5 in 100, or 0.05, before they will
risk the assumption that a result is not due to chance. By this
standard, we eliminate chance in the present case, and are
entitled to conclude that the proportion of males in
the three populations is related to the degree of urbanness in those
populations.

In many situations similar to this, the binomial theorem
enables us to determine the probability that repeated events may
occur by chance alone, and to note whether or not the probability
is so small that we may reject the hypothesis that chance is
responsible.

It is important to ask what is meant by chance in the preceding
illustration. If we regard the census figures for the three populations
as representing three complete universes, there is no
question of chance at all. Any differences noted in the proportions
of males, however small they may be, are real differences
between the universes, and that is the end of the matter. But
if we think of the proportion of males in each of our three populations
as determined by a separate set of causes acting to produce
sample results, and if we want to know whether or not these
three sets of forces differ in any real way from one another, the
problem of chance at once enters. By chance we mean a great
number of small, unknown factors acting in many directions, as
contrasted with large (biased)¹ factors, usually known or knowable,
acting constantly in the same direction. If the biased factors
affecting the proportion of males differ from one of the three
populations to another — e.g., more females than males migrate
from rural to urban areas — the observed proportions of males will
differ to a greater extent than can be accounted for by the action
of small random forces. If the biased factors that produce the
proportion of males in each of the three populations are essentially
the same, however, any variation in the proportion of males
from one population to another must be due to chance factors
alone. It is usually good research method to seek to eliminate
chance as a possible cause of differences before undertaking to
discover what factors are responsible.

If we already know from independent evidence, however, that
important factors influencing the proportion of males varied
between the three populations — e.g., the two sexes migrated
unequally from the more-rural to the less-rural areas — there
would be no point in testing the hypothesis that the differences
were due to chance, except perhaps to confirm the a priori
knowledge. When such a test fails to eliminate chance, it often
means only that a larger sample is needed. It may sometimes be
advisable to investigate carefully the biased factors in the situations
under comparison, even when chance has not been eliminated
as a possible cause of the differences observed between them.

A binomial distribution, such as that of Table 45, is like other
distributions in having a mean, a standard deviation, and other
statistical constants by which it may be described. The formulas
for the mean and the standard deviation are

$$M_B = np,$$    (47)

$$\sigma_B = \sqrt{npq},$$    (48)²

where the symbols have the same meanings as above.

For the distribution of Table 45, $M_B = 10(.5) = 5$ heads, and
$\sigma_B = \sqrt{10(.5)(.5)} = 1.58$ heads.

¹ See also third paragraph on p. 149, above.

² For a derivation of these formulas see, for example, C. H. Richardson,
An Introduction to Statistical Analysis, pp. 228-230, Harcourt, Brace and
Company, Inc., New York, 1934. The subscript, B, means binomial.
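
Formulas (47) and (48) can be compared with the mean and standard deviation computed term by term from the distribution itself. This is a sketch only, not a derivation; the function name `binomial_mean_sd` is an assumption.

```python
from math import comb, sqrt

def binomial_mean_sd(n, p):
    """Mean and standard deviation computed term by term from the distribution."""
    q = 1 - p
    probs = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
    return mean, sqrt(variance)

n, p = 10, 0.5
mean, sd = binomial_mean_sd(n, p)
print(mean, n * p)                                     # compare with formula (47)
print(round(sd, 2), round(sqrt(n * p * (1 - p)), 2))   # compare with formula (48)
```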
It is not necessary in chance situations that p and q should be
equal. Thus, the probability of throwing an ace in a single toss
of a die is p = 1/6, and the probability of not throwing an ace is
q = 5/6. If 15 throws are to be made, the binomial is $(\tfrac{5}{6} + \tfrac{1}{6})^{15}$,
and this can be expanded and utilized just as was done above
for p = q = 1/2. When p = q, the binomial is symmetrical in
shape; when p ≷ q,¹ it is asymmetrical or skewed.

¹ ≷ means greater or less than.

4. The Normal Distribution. — Graphs of the binomials
$32(\tfrac{1}{2} + \tfrac{1}{2})^{5}$ and $1{,}024(\tfrac{1}{2} + \tfrac{1}{2})^{10}$ are shown in Fig. 38. Notice that
they take the form of histograms rather than of smooth curves,
because successes are counted only in whole numbers, yielding a
discrete or discontinuous series. However, if the length of the
scale is kept constant, as in the figure, the graph of the binomial
$1{,}024(\tfrac{1}{2} + \tfrac{1}{2})^{10}$ is seen to be less broken in outline than is that of the
binomial $32(\tfrac{1}{2} + \tfrac{1}{2})^{5}$. As n increases, the graph approaches closer
and closer to a smooth curve in appearance. If now n is indefinitely
increased, giving the binomial $N(\tfrac{1}{2} + \tfrac{1}{2})^{n}$, it is evident that
the intervals of the graph become smaller and smaller, until
in effect the outline merges into that of a smooth curve. The
resulting curve is the most important type of distribution in
statistical theory, and is known variously as the normal curve, the
Gaussian curve, the curve of error, or the curve of probabilities.
Unlike the binomial distribution, it represents a continuous
variable, which can take any value whatever, on the X scale.
A graph of the normal curve is shown in Fig. 39. It may be
thought of as enclosing a continuous surface, cut from a piece of
thin sheet metal. Its equation is usually written

Fig. 38. — Histograms of binomials, $N(\tfrac{1}{2} + \tfrac{1}{2})^{n}$, as n increases.


    $y = \dfrac{N}{\sigma_x \sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma_x^2}}$,    (49)

where $x = X - M$, or a mean deviate of X,
      $N$ = total frequency of the distribution,
      $\pi = 3.1416$, so that $\sqrt{2\pi} = 2.5066$,
      $e = 2.7183$, the base of natural logarithms.

If the area of the curve is taken as unity, equation (49) becomes

    $y = \dfrac{1}{\sigma_x \sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma_x^2}}$.    (50)

As an aid to understanding the curve represented by equation
(50), let us analyze its equation. We shall begin by letting
$x/\sigma_x = t$, so that equation (50) becomes

    $y = \dfrac{1}{\sigma_x \sqrt{2\pi}} \, e^{-\frac{t^2}{2}}$.    (51)

In the calculation of tables of normal ordinates, it is also
convenient to let $\sigma_x = 1$, giving

    $y = \dfrac{1}{\sqrt{2\pi}} \, e^{-\frac{t^2}{2}}$.    (52)

But, as seen above, $\pi$ is a mathematical constant with the value
3.1416, so that $\sqrt{2\pi} = 2.5066$, and $1/\sqrt{2\pi} = .3989$. Equation
(52) may therefore be written

    $y = .3989 \, e^{-\frac{t^2}{2}}$.    (53)
In Fig. 39, at the mean of the X's on the X axis, $x = 0$. The
height of the ordinate at any point is the value of y found from
equation (53) by substituting the appropriate value of t. At
$x = 0$, $t = x/\sigma_x = 0/\sigma_x = 0$, and

    $y = .3989 \, e^{-\frac{(0)^2}{2}}$,
    $y = .3989 \, e^{0}$.

But any number raised to the zero power is 1,* so that

    $y = .3989(1) = .3989$.

In other words, at the mean of the X's, the height of the ordinate,
y, is .3989, for any normal curve of unit area and unit standard
deviation. This is plotted in Fig. 39.

Next, for the same case, let $t = x/\sigma_x = +2$. Then, by
formula (53),

    $y = .3989 \, e^{-\frac{(2)^2}{2}}$,
    $y = .3989 \, e^{-\frac{4}{2}}$,
    $y = .3989 \, e^{-2}$.

But $e^{-2} = 1/e^{2}$, so that

    $y = \dfrac{.3989}{e^{2}}$.

It has also been seen that e, like $\pi$, is a mathematical constant,
having the value 2.7183, so that $e^{2} = 7.38906$. Hence, at $t = +2$,

    $y = \dfrac{.3989}{7.38906} = .05399$.

This value is also plotted in Fig. 39. Notice that at $x/\sigma_x = -2$, the
value of y is the same as at $x/\sigma_x = +2$, for in formula (53)
evidently $e^{-\frac{(-2)^2}{2}}$ is the same as $e^{-\frac{(+2)^2}{2}}$. The normal curve is thus
symmetrical, i.e., of the same shape, on each side of the mean.
From this it follows that the mean and the median coincide.

* See any text in elementary algebra.
The student is asked to check the values of y found above at
$x/\sigma_x = 0$ and at $x/\sigma_x = \pm 2$ against those printed in Appendix
Table 1. All the values in that table are calculated in this way,
and may be used to complete the construction of Fig. 39. Thus,
the height of the ordinates at $\pm 1\sigma$, read from the table, is .2420,
and is so scaled in the figure. After several ordinates have been
drawn, they are connected by a smooth line, to form the curve
shown.
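The check asked of the student can also be made by machine. The sketch below evaluates formula (53) at a few values of t; the function name `normal_ordinate` is our own, and the exact constant $1/\sqrt{2\pi}$ is used in place of the rounded .3989.

```python
import math

def normal_ordinate(t):
    """Height y of the unit normal curve at t = x / sigma, as in formula (53)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

# Ordinates at the mean and at one and two standard deviations on each side.
ordinates = {t: normal_ordinate(t) for t in (-2, -1, 0, 1, 2)}
for t, y in ordinates.items():
    print(f"t = {t:+d}: y = {y:.4f}")
```

The ordinates at $+t$ and $-t$ come out identical because t enters the formula only as $t^2$.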

The tallest ordinate of the normal curve occurs at the mean;
hence the mean, median, and mode all coincide. This appears
from the fact that when $x = 0$, $y = .3989$; whereas, when
$x \neq 0$, $y = .3989/e^{t^2/2}$. The latter term is always smaller than
the former, since all positive powers of e are greater than
1 ($e^{0} = 1$).

Another characteristic of the normal curve is that it is asymp- 
totic to the X axis, meaning that the curve constantly approaches 
but never touches the X axis as it extends indefinitely in both 
directions from the mean. 

Table 46 shows a hypothetical normal distribution with
perfectly symmetrical frequencies. The actual frequencies of
normal tables may depart in various degrees from this
symmetrical pattern, because of sampling errors or the use of class
intervals that do not place the mean of the series exactly at the
center of the distribution.

Table 46. Normal Distribution of Scores on an Army Attitudes Test
(Hypothetical Data)

    Scores (X)      Men (f)
    0-4.9               5
    5-9.9              17
    10-14.9            44
    15-19.9            92
    20-24.9           150
    25-29.9           191
    30-34.9           191
    35-39.9           150
    40-44.9            92
    45-49.9            44
    50-54.9            17
    55-59.9             5
    Total             998
With the help of the integral calculus, it is possible to find the
proportion of the area under any part of the normal curve, i.e.,
between the ordinates erected at any two points on the x scale.
This has been done for the areas between the ordinate at the
mean and ordinates erected at intervals of $.01\sigma$ along the x-axis.
The results are shown in Appendix Table 1, in the column
headed "Area." Thus the area under the curve between the
ordinate at the mean and the ordinate at $1\sigma$ is seen to be 0.34, or
34 per cent (roughly one-third) of the total area under the curve.
In Chap. VI we saw that the area under a frequency histogram,
where the width of the interval is taken as one unit, is equal to
the total frequency of the distribution. The same principle holds
for the normal curve.

Since the normal curve represents the distribution of
frequencies in any normal universe, the proportion of the area between
the ordinate at the mean and the ordinates at, say, $x = \pm 1\sigma$
represents the most probable proportion of the frequencies of
any random sample drawn from such a universe that may be
expected to fall between the values $x/\sigma = 0$ and $x/\sigma = \pm 1$.
Differently expressed, the proportion of the area between the
ordinate at the mean and the ordinates at $x = \pm 1\sigma$ is the
probability that a random sample value of X will fall between $M_x$ and
$\pm 1\sigma$. We see from Appendix Table 1 that this probability is
twice 0.34, which is approximately 0.68, or 68 per cent. It
should now be clear why in a normal distribution the odds are
about two to one that a random value of X will be within a
range of one standard deviation on each side of the mean value of
X. Also, inasmuch as a value of X falls outside the range of
$M_x \pm 2\sigma$ by chance only $1.00 - (2 \times 0.477) = 0.046$, or about one
time in 20, we shall be fairly safe if we attribute those values that
do so to something else than chance. In other words, we shall
arbitrarily regard all such extreme values as significant.
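The areas quoted from Appendix Table 1 can be reproduced, at least approximately, with the error function of Python's standard `math` module. This is only a sketch; the helper name `area_mean_to` is our own, and it returns the area between the mean ordinate and the ordinate at $t = x/\sigma$.

```python
import math

def area_mean_to(t):
    """Area under the unit normal curve between the mean and t = x / sigma."""
    return 0.5 * math.erf(abs(t) / math.sqrt(2.0))

within_1 = 2 * area_mean_to(1.0)          # between -1 sigma and +1 sigma
outside_2 = 1.0 - 2 * area_mean_to(2.0)   # beyond 2 sigma on either side

print(f"mean to 1 sigma:   {area_mean_to(1.0):.4f}")
print(f"within 1 sigma:    {within_1:.4f}")
print(f"outside 2 sigma:   {outside_2:.4f}")
```

The small differences from the figures in the text arise because the text rounds the tabled areas to 0.34 and 0.477 before combining them.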

Reading again from Appendix Table 1, it is seen that
approximately 25 per cent of the area of the normal curve lies between
the mean and an ordinate at $x = 0.67\sigma$. That is, one-half of the
area of the curve is included between an ordinate at $-0.67\sigma$ and
an ordinate at $+0.67\sigma$. From a finer table, the figure is found
more exactly to be $0.6745\sigma$. The distance $0.6745\sigma$ from the
mean along the x-axis of the normal curve is commonly called the
probable error (P.E.), and is often used instead of the standard
deviation, $\sigma$, or standard error, as it is called in sampling theory
(see Chap. XII).
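The figure 0.6745 need not be taken from a finer table; it is simply the point on the unit normal curve below which three-fourths of the area lies. A minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the variable names are our own.

```python
from statistics import NormalDist

unit_normal = NormalDist(mu=0.0, sigma=1.0)

# The probable error is the distance from the mean that cuts off
# 25 per cent of the area on each side, i.e. the 75th percentile.
probable_error = unit_normal.inv_cdf(0.75)
area_within_pe = unit_normal.cdf(probable_error) - unit_normal.cdf(-probable_error)

print(f"P.E. = {probable_error:.4f} sigma")
print(f"area within +/- P.E. = {area_within_pe:.4f}")
```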

The relationships of the preceding paragraphs do not hold,
however, for skewed distributions. This may be seen from Fig.
41. By comparing the rectangles in the areas $M - 1\sigma$ and
$M + 1\sigma$, it is clear that in this case a much larger proportion
of the area of the curve is contained between $M - 1\sigma$ than
between $M + 1\sigma$, so that the standard deviation has no constant
relation to the area or frequency. For this reason, the standard
deviation has a variable meaning when applied to asymmetrical
distributions, and should be cautiously interpreted in such cases.

[Figures: standard deviation and area under normal curve; standard
deviation and area under skewed curve (Fig. 41).]

In a normal distribution, A.D. $= .80\sigma$, so that the distance
$M \pm$ A.D. on the x scale includes about 58 per cent of the
frequencies (see Appendix Table 1).

When n is large, the labor of expanding the binomial becomes
excessive. Under these conditions, if the value of np or nq is not
too small, say 5 or more, the binomial so closely approximates the
normal curve that the latter may be used in its stead for purposes
of estimation, and the desired probabilities simply read from
Appendix Table 1.

Consider again the probability of getting eight or more heads
or tails in a toss of only 10 pennies. In Fig. 39 we erect a
perpendicular at the point

    $\dfrac{x}{\sigma_B} = \dfrac{X - np}{\sqrt{npq}} = \dfrac{8 - 10(.5)}{\sqrt{10(.5)(.5)}} = 1.90$,

and find the area between this ordinate and the ordinate at the
mean. Entering Appendix Table 1 with $x/\sigma = 1.90$, we see
that the area desired is 0.4713 of the area of the whole curve.
Subtracting 0.4713 from 0.5000, the area of half the curve, we
get 0.0287 as the area to the right of the ordinate at $x/\sigma = +1.9$.




Since the normal curve is assumed to represent the results of all
possible tosses of 10 pennies, the area to the right of $x/\sigma = +1.9$
shows the proportion of tosses that in the long run may be
expected to give eight or more heads. This proportion is the
probability of getting eight or more heads in one toss of 10 pennies,
so the probability of getting eight or more heads or eight or more
tails is twice this, or P = 2(0.0287) = 0.0574. The true value
of P as found above from the binomial expansion is P = 0.1094.
The agreement is thus seen to be none too good when n is as small
as 10. If n is increased to 15, however, np = 7.5, and we find
more agreement. The probability of getting, say, 12 or more
heads or tails is 0.0204 according to the normal curve, and
2(0.0176) = 0.0352 according to the binomial,¹ the error being
only 0.0148. For larger values of n, the two estimates may for
most purposes be accepted as equivalent.
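The comparison for 10 pennies may be sketched in Python as follows. The function names are our own; the normal-curve area is computed from the complementary error function rather than read from a table, so the result differs slightly from 0.0574, which rests on rounding $x/\sigma$ to 1.90. Doubling the one-sided probability is legitimate here only because $p = q = \tfrac{1}{2}$.

```python
import math

def binomial_tail(n, k, p=0.5):
    """Exact probability of k or more successes in n trials."""
    q = 1.0 - p
    return sum(math.comb(n, r) * p**r * q**(n - r) for r in range(k, n + 1))

def normal_tail(n, k, p=0.5):
    """Area of the normal curve to the right of the ordinate erected at X = k."""
    z = (k - n * p) / math.sqrt(n * p * (1.0 - p))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Eight or more heads, or eight or more tails, in a toss of 10 pennies.
exact = 2 * binomial_tail(10, 8)
approx = 2 * normal_tail(10, 8)

print(f"binomial: {exact:.4f}   normal curve: {approx:.4f}")
```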

The approximate probability of getting exactly eight heads or
eight tails in a toss of n = 10 pennies is the height of the ordinate
of the normal curve at the point X = 8, expressed in standard
deviation units. This is because the number 8 is represented
on the X scale by a point rather than by a distance, and on this
point can be erected only a straight line, or ordinate, which
theoretically has no width and hence no area. We now need
$y/\sigma_B$ at X = 8, i.e., at

    $\dfrac{x}{\sigma_B} = \dfrac{8 - 5}{\sqrt{10(0.5)(0.5)}} = 1.9$.

From Appendix Table 1 we find y = 0.0656, so that

    $\dfrac{y}{\sigma_B} = \dfrac{0.0656}{\sqrt{10(0.5)(0.5)}} = 0.0415$,

and 2 × 0.0415 = 0.083 is the probability desired. The correct
probability already found by the binomial is 0.0879.
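The ordinate method just described can be sketched in a few lines of Python. The variable names are our own; the ordinate is computed from the equation of the unit normal curve instead of being read from Appendix Table 1.

```python
import math

n, p, k = 10, 0.5, 8
sigma_b = math.sqrt(n * p * (1 - p))       # standard deviation of the binomial
t = (k - n * p) / sigma_b                  # X = 8 in standard deviation units
y = math.exp(-t * t / 2) / math.sqrt(2 * math.pi)   # ordinate of unit normal curve

approx_one = y / sigma_b                   # exactly eight heads
approx_either = 2 * approx_one             # exactly eight heads or eight tails
exact_either = 2 * math.comb(n, k) * p**n  # from the binomial, since p = q

print(f"normal ordinate: {approx_either:.4f}   binomial: {exact_either:.4f}")
```

Dividing y by $\sigma_B$ converts the ordinate of the unit curve into the ordinate of a curve whose base is measured in whole successes.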

If we choose to consider the normal curve merely as a device for
approximating the probabilities of the binomial, rather than as a
continuous mathematical distribution, it becomes possible to
take certain liberties with it that will improve its accuracy for the
purpose. For example, to determine the probability of throwing
eight or more heads or tails in a toss of 10 pennies, we may allow
the value X = 8 to occupy the area under the normal curve
between the X values 7.5 and 8.5, and regard the area to the
right of 7.5 as representing the probability of throwing eight or
more heads. We may then erect a perpendicular in Fig. 39
at the point

    $\dfrac{x}{\sigma_B} = \dfrac{X - np}{\sqrt{npq}} = \dfrac{7.5 - 10(0.5)}{\sqrt{10(0.5)(0.5)}} = 1.58$,

and find the area between this ordinate and the ordinate at the
mean to be 0.4429 (Appendix Table 1). The area to the right
of the ordinate at $1.58\sigma$ is, therefore, 0.5000 − 0.4429 = 0.0571,
which is the probability of throwing eight or more heads. The
probability of throwing eight or more heads or eight or more
tails is 2 × 0.0571 = 0.1142. This result is much closer to the
correct binomial probability of 0.1094 than was that obtained
above in the orthodox way. Indeed, the accuracy of the normal
curve in approximating the binomial has now been made quite
satisfactory even for n = 10.

1 For 12 or more heads in a toss of 15 pennies,
${}_{15}C_{12}\,p^{12}q^{3} + {}_{15}C_{13}\,p^{13}q^{2} + {}_{15}C_{14}\,p^{14}q + p^{15} = (\tfrac{1}{2})^{15}(455 + 105 + 15 + 1) = 0.01758$.

It is also possible to use a similar manipulation in estimating
the probability of throwing exactly eight heads or eight tails in a
toss of 10 pennies. We find from Appendix Table 1 the area
under the curve included between an ordinate at X = 7.5 and
an ordinate at X = 8.5. The table gives 0.4864 as the area
between the mean ordinate and the ordinate at X = 8.5 (i.e., at
$\dfrac{8.5 - 10(0.5)}{\sqrt{10(0.5)(0.5)}} = 2.21\sigma$), and 0.4429 as the area between
the mean ordinate and the ordinate at X = 7.5 (i.e., at
$\dfrac{7.5 - 10(0.5)}{\sqrt{10(0.5)(0.5)}} = 1.58\sigma$).
Consequently, the area between the ordinate at X = 7.5 and
the ordinate at X = 8.5 is

    0.4864 − 0.4429 = 0.0435.

This is the probability of throwing exactly eight heads in a toss
of 10 pennies; so the probability of throwing exactly eight heads
or exactly eight tails is 2 × 0.0435 = 0.0870. The error from
the binomial (0.0879) in this case is negligible.

It should be noted that special modifications like those above
in the use of the normal curve are usually worth while only when
np is small, say np < 6.
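Both of these modified estimates follow one rule: each whole number X is allowed to occupy the interval from X − 0.5 to X + 0.5 under the normal curve. A minimal Python sketch of that rule is given below; the function `corrected_normal` is our own illustration, and it assumes Python 3.8 or later for `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def corrected_normal(n, lo, hi=None, p=0.5):
    """Normal-curve area assigned to the whole numbers lo..hi.

    Each count X is taken to occupy the interval X - 0.5 to X + 0.5.
    With hi=None the area runs from lo - 0.5 to the end of the curve.
    """
    curve = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    upper = 1.0 if hi is None else curve.cdf(hi + 0.5)
    return upper - curve.cdf(lo - 0.5)

# Doubling gives "heads or tails", which is valid here because p = q.
eight_or_more = 2 * corrected_normal(10, 8)
exactly_eight = 2 * corrected_normal(10, 8, 8)

print(f"eight or more: {eight_or_more:.4f}")
print(f"exactly eight: {exactly_eight:.4f}")
```

The results differ in the fourth decimal place from those in the text, which were obtained by rounding the deviates to 1.58 and 2.21 before entering the table.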

5. Skewness and Kurtosis. The frequency distributions
with which social scientists have to deal usually depart
considerably from the normal form. Such a distribution is shown
in Table 47 and in Fig. 42. It is readily seen to extend farther
in the positive direction from the mean than in the negative
direction, and so is said to be positively skewed. If there is
occasion to measure the amount of the skewness, an index is
provided by formula (54):

    $Sk = \dfrac{3(M - Md)}{\sigma}$.    (54)¹

Table 47. Relative Numbers of Divorced Couples by Years Married

    Years          Divorced       Accumulated
    married (X)    couples (f)    frequency
    0-0.9              15              15
    1.0-1.9            72              87
    2.0-2.9            60             147
    3.0-3.9            43             190
    4.0-4.9            21             211
    5.0-5.9            17             228
    6.0-6.9             9             237
    7.0-7.9             8             245
    8.0-8.9             5             250
    9.0-9.9             2             252
    Total             252

The values of Sk by this formula vary between ±3, but values
larger than ±1 do not often occur. If there is no skewness,
Sk = 0.

A more useful measure of skewness for some purposes is $g_1$,
which for large samples is approximately

    $g_1 = \dfrac{\nu_3}{\sigma^3}$.    (58)*

$\nu_3$ is the third moment about the mean of the distribution, defined
by the equation $\nu_3 = \Sigma f x^3 / N$, where x is a mean deviate as
usual.

For a normal distribution $g_1 = 0$. For other values of $g_1$ the
sign indicates the direction of the skewness. Values of $g_1$ as
great as ±2 mean decided skewness.

A frequency distribution may also depart from the normal in
height or "peakedness." This is called kurtosis. If the observed
distribution is flatter than the normal, it is said to be platykurtic;
if more peaked, leptokurtic; if neither, mesokurtic. Kurtosis may
be measured by $g_2$. For large samples, an approximate formula is

    $g_2 = \dfrac{\nu_4}{\nu_2^{2}} - 3$.    (59)

$\nu_4$ is the fourth moment, $\Sigma f x^4 / N$, of the distribution, and $\nu_2^{2}$ is
the second moment, $\nu_2 = \sigma^2 = \Sigma f x^2 / N$, squared.

$g_2$ also is zero for a normal distribution. A positive value of
$g_2$ indicates that the observed distribution is more peaked than
the normal, and a negative value indicates that it is flatter.

1 We saw in Chap. VII that the value of the mean, M, is influenced by
extreme values, and hence by skewness, but that the value of the mode, Mo,
is not affected. In the present chapter it was learned that in a normal
distribution the mean, mode, and median all have the same value. These
facts suggest as an approximate measure of absolute skewness, Sk, the
difference

    $Sk = M - Mo$.    (55)

To change this to generalized units, we may write

    $Sk = \dfrac{M - Mo}{\sigma}$.    (56)

Because the value of the mode can seldom be accurately determined,
however, it is considered preferable to replace it by its equivalent in terms of
the median, Md. In any moderately skewed distribution, the median falls
about two-thirds of the distance from the mode to the mean (see Chap. VII,
Fig. 31). We therefore have

    $Mo = M - 3(M - Md)$.    (57)

Substituting this value of the mode in formula (56),

    $Sk = \dfrac{M - [M - 3(M - Md)]}{\sigma}$,

    $Sk = \dfrac{3(M - Md)}{\sigma}$.    (54)

* $\nu$ is the lower-case Greek letter nu.
Before formula (58) or (59) can conveniently be applied to a
distribution like that of Table 47, some short-cut calculating
formulas are needed:

    $\nu_2 = \sigma^2 = i^2 \left[ \dfrac{\Sigma f d^2}{N} - \left( \dfrac{\Sigma f d}{N} \right)^2 \right]$,    (60)

    $\nu_3 = \dfrac{i^3}{N} \left[ \Sigma f d^3 - \dfrac{3}{N} \Sigma f d \, \Sigma f d^2 + \dfrac{2}{N^2} (\Sigma f d)^3 \right]$,    (61)

    $\nu_4 = \dfrac{i^4}{N} \left[ \Sigma f d^4 - \dfrac{4}{N} \Sigma f d^3 \, \Sigma f d + \dfrac{6}{N^2} \Sigma f d^2 (\Sigma f d)^2 - \dfrac{3}{N^3} (\Sigma f d)^4 \right]$,    (62)

where i = width of class interval,
      d = unit step deviation from an assumed mean,
      N = $\Sigma f$.

Notice that formulas (61) and (62) are merely extensions of the
familiar short method of finding a standard deviation by the use
of an assumed mean and unit step intervals. This appears
clearly in Table 48, below.

Let us now measure the skewness and kurtosis of the
distribution shown in Table 47, by comparing it with the normal curve.
We set up the computing table:

Table 48. Computing Table for Moments: Data of Table 47

    Years
    married       f      d      fd     fd²      fd³       fd⁴
    0-0.9        15     -2     -30      60     -120       240
    1.0-1.9      72     -1     -72      72      -72        72
    2.0-2.9      60      0       0       0        0         0
    3.0-3.9      43     +1     +43      43       43        43
    4.0-4.9      21     +2     +42      84      168       336
    5.0-5.9      17     +3     +51     153      459     1,377
    6.0-6.9       9     +4     +36     144      576     2,304
    7.0-7.9       8     +5     +40     200    1,000     5,000
    8.0-8.9       5     +6     +30     180    1,080     6,480
    9.0-9.9       2     +7     +14      98      686     4,802
    Total       252            154   1,034    3,820    20,654

Recalling the short formula for the mean,

    $M = A + \dfrac{i \, \Sigma f d}{N}$,

where A is the assumed mean, we find for this table,

    $M = 2.5 + 1\left(\dfrac{154}{252}\right) = 3.11$.
Substituting in the formula for the median, where the symbols
have the meanings explained in Chap. VII, we find

    $Md = 2.0 + \dfrac{126 - 87}{60}(1) = 2.65$.

For the standard deviation, we have

    $\sigma = i \sqrt{\dfrac{\Sigma f d^2}{N} - \left( \dfrac{\Sigma f d}{N} \right)^2}$,

    $\sigma = 1 \sqrt{\dfrac{1{,}034}{252} - \left( \dfrac{154}{252} \right)^2}$,

    $\sigma = 1.93$.

Hence, according to formula (54), we find the skewness to be

    $Sk = \dfrac{3(3.11 - 2.65)}{1.93} = 0.72$.

This shows considerable skewness in the positive direction.
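The chain of calculations leading to Sk may be retraced with the short Python sketch below. It assumes, as the assumed mean of 2.5 in the text implies, that each class runs from its lower limit to the next whole number (so that the class 2.0-2.9 has its midpoint at 2.5). All variable names are our own.

```python
import math

# Table 47: lower class limits and frequencies of divorced couples.
lower_limits = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
freqs = [15, 72, 60, 43, 21, 17, 9, 8, 5, 2]
width = 1.0
n = sum(freqs)

# Mean and standard deviation by the assumed-mean, unit-step method.
assumed_index = 2                                   # class 2.0-2.9
assumed_mean = lower_limits[assumed_index] + width / 2
steps = [k - assumed_index for k in range(len(freqs))]
sum_fd = sum(f * d for f, d in zip(freqs, steps))
sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))
mean = assumed_mean + width * sum_fd / n
sd = width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Median by interpolation within the class containing the N/2-th case.
cumulative = 0
for lower, f in zip(lower_limits, freqs):
    if cumulative + f >= n / 2:
        median = lower + (n / 2 - cumulative) / f * width
        break
    cumulative += f

skewness = 3 * (mean - median) / sd                 # formula (54)
print(f"M = {mean:.2f}, Md = {median:.2f}, sigma = {sd:.2f}, Sk = {skewness:.2f}")
```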

Let us next measure the amount of skewness in Table 48 by
the use of formula (58). From formulas (60) and (61) we find

    $\nu_2 = \sigma^2 = (1.93)^2 = 3.73$,

    $\nu_3 = \dfrac{1}{252} \left[ 3{,}820 - \dfrac{3}{252}(1{,}034)(154) + \dfrac{2}{252^2}(154)^3 \right] = 8.09$.

Substituting in formula (58),

    $g_1 = \dfrac{8.09}{(1.93)^3} = 1.13$.

This result agrees with that obtained by formula (54), in
showing positive skewness.

We shall now measure the degree of kurtosis, if any, exhibited
by the distribution of Table 48, through the use of formula (59).
We need only one new value, $\nu_4$, which may be found by formula
(62).

    $\nu_4 = \dfrac{1}{252} \left[ 20{,}654 - \dfrac{4}{252}(3{,}820)(154) + \dfrac{6}{252^2}(1{,}034)(154)^2 - \dfrac{3}{252^3}(154)^4 \right]$,

    $\nu_4 = \dfrac{13{,}528}{252} = 53.7$.
Substituting in formula (59),

    $g_2 = \dfrac{53.70}{(3.73)^2} - 3.00$,

    $g_2 = 3.86 - 3.00 = 0.86$.

The value of $g_2$ is positive, so we conclude that the observed
distribution is leptokurtic, or more peaked than a normal curve.¹

Even though a sample distribution is found by the above
methods to differ from the normal, the question arises whether or
not the difference is one that might be due merely to random
errors of sampling. This point is dealt with in Chap. XIII.
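The moments and the two indexes $g_1$ and $g_2$ can be recomputed from the frequencies of Table 47 by the short-cut formulas (60) to (62). The Python sketch below is only illustrative, and its names are our own. Because it carries full precision instead of rounding $\sigma$ to 1.93, it gives $g_1$ as about 1.12 rather than 1.13.

```python
# Frequencies of Table 47 and unit step deviations from the assumed mean 2.5.
freqs = [15, 72, 60, 43, 21, 17, 9, 8, 5, 2]
steps = [-2, -1, 0, 1, 2, 3, 4, 5, 6, 7]
i = 1.0                      # width of class interval
n = sum(freqs)

def step_sum(power):
    """Sum of f * d**power over all classes."""
    return sum(f * d**power for f, d in zip(freqs, steps))

s1, s2, s3, s4 = (step_sum(k) for k in (1, 2, 3, 4))

nu2 = i**2 * (s2 / n - (s1 / n) ** 2)                                   # (60)
nu3 = i**3 / n * (s3 - 3 / n * s2 * s1 + 2 / n**2 * s1**3)             # (61)
nu4 = i**4 / n * (s4 - 4 / n * s3 * s1
                  + 6 / n**2 * s2 * s1**2 - 3 / n**3 * s1**4)           # (62)

g1 = nu3 / nu2**1.5          # formula (58); sigma cubed is nu2 ** 1.5
g2 = nu4 / nu2**2 - 3        # formula (59)

print(f"nu2 = {nu2:.2f}, nu3 = {nu3:.2f}, nu4 = {nu4:.1f}")
print(f"g1 = {g1:.2f}, g2 = {g2:.2f}")
```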


Exercises 

1. Twelve children are to be used in the experimental study of
dominating and submissive types of behavior. Each child is to be grouped
(a) with one other child, (b) with two other children. What is the total
possible number of such experimental groups of each size?

2. Four villages, five cities, and five rural counties are to be grouped
in all possible combinations of five. No distinction is made between
areas of the same type, i.e., one village is the equivalent of another
village. What is the total number of combinations? Describe them.

3. The types of contact between families in a community are listed
as: visit, church, lodge, school, business, and other. But any or all
of these contacts may appear together, as well as separately. How
many combinations of all kinds are there between these several types
of contact?

4. The educational levels of a sample of husbands and wives are
recorded as college, high school, grades, and illiterate. What is the
total number of possible permutations of husband-wife relationships
in terms of these levels, and what are they?

5. How many marriages are possible between three pairs of brothers
and sisters in our society?

1 Another measure of kurtosis that is more commonly used than $g_2$ is $\beta_2$:

    $\beta_2 = \dfrac{\nu_4}{\nu_2^{2}}$.    (63)

For Table 48, above, $\beta_2 = 3.86$. Since in a normal distribution $\beta_2 = 3$,
the observed distribution is again seen to be leptokurtic.
6. In an experiment with four pairs of subjects, each pair consists of
a male and a female, closely "matched" in respect to certain sociological
characteristics. They are to be given a test while seated around a table
in such a way that the sexes alternate, and no members of a matched
pair sit next to each other. In how many ways may this be done?

Note: The number of different permutations of n things taken n at a
time when arranged in a circle is given by the formula (n − 1)!.

7. Gist and Clark give the following table:

Distribution of Intelligence Scores of 2,544 (Kansas) Rural High-
School Students in 1923, According to Present Rural and
Urban Classification*

    I.Q.             Urban     Rural     Total
    Under 95           378       832     1,210
    95-104             326       472       798
    105 and over       260       276       536
    Total              964     1,580     2,544

* American Journal of Sociology, July, 1938, p. 43.

Compare the observed frequencies with those expected by chance alone,
apply the $\chi^2$ test, and comment on the results.

8. Classification of many cases shows that the probability of a
marriage ending in divorce under certain conditions is 0.20. In a sample of
20 such marriages, what is the probability that there will be no divorce?
What is the probability that there will be no more than two divorces?
Compare the results from the binomial with those from the normal
curve.

9. In Exercise 8, if many random samples of 20 marriages each were
taken from the type of marriage referred to, (a) What mean number
of marriages per sample would be expected to end in divorce? (b) What
would be the standard deviation of the numbers of marriages ending in
divorce found from many samples?

10. Calculate skewness and kurtosis for the distributions below:

Failures on Parole in 50 Subsamples of Five Prisoners Each

    Failures      Frequency
    0                  1
    1                 10
    2                 17
    3                 15
    4                  7
    5                  0
    Total             50

Families by Size

    Persons       Frequencies
    1                 24
    2                 70
    3                 62
    4                 52
    5                 36
    6                 23
    7                 14
    8                  8
    9                  5
    10                 3
    11                 1
    12 or more         1
    Total            299

Families Classified by Age of Male Head

    Age, years    Frequency
    Under 25          13
    25-34             59
    35-44             71
    45-54             57
    55-64             37
    65-74             19
    75 and over        6
    Total
CHAPTER X


GROSS RELATIONSHIP BETWEEN TWO FACTORS;
SIMPLE LINEAR QUANTITATIVE CORRELATION

One of the most common purposes of social research is to
discover whether or not there is any relationship between two
factors, and to measure the amount of the relationship. For
example, does the number of children in a family tend to decrease
as the family income increases? If treated statistically, this
kind of question is called a problem in correlation. As will be
seen below, statistics is able to measure the amount of
relationship (correlation) present in such cases, to provide an equation
by which one of the factors can be predicted from a knowledge
of the other, and to estimate the range of error in the predictions.

1. The Scatter Diagram: Ungrouped Data. As an
introduction to the method of simple linear correlation applied to
ungrouped data, let us test the idea that the largest percentage
increases of population in the United States between 1920 and
1930 occurred in regions where the density of population per
square mile was least in 1920. We shall limit ourselves here to
examining the amount of correlation in the nine census divisions.
The necessary figures are given in Table 49.

Table 49. Percentage of Population Increase, 1920-1930 (Y), in
Relation to Population per Square Mile in 1920 (X), by
Geographic Divisions, United States*

    Division                  X       Y
    New England             119      10
    Middle Atlantic         223      18
    East North Central       88      18
    West North Central       25       6
    South Atlantic           52      13
    East South Central       50      11
    West South Central       24      19
    Mountain                  4      11
    Pacific                  18      47

* From Abstract of the Fifteenth Census of the United States, 1930, pp. 12-13.
We may make a preliminary judgment by rough methods as to
whether or not any relationship is present between the X and Y
series. Taking the four largest values of X, we find the average
of the four corresponding Y values to be 14.75. For the four
smallest values of X, the average Y value is 20.75. In other
words, as the X values decrease, the Y values tend to increase,
on the average. This suggests that there is some negative
relationship between the two series.
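The rough comparison of averages can be sketched in Python as follows, using the figures of Table 49. The dictionary and variable names are our own.

```python
# Table 49: (population per square mile in 1920, percentage increase 1920-1930).
divisions = {
    "New England": (119, 10),
    "Middle Atlantic": (223, 18),
    "East North Central": (88, 18),
    "West North Central": (25, 6),
    "South Atlantic": (52, 13),
    "East South Central": (50, 11),
    "West South Central": (24, 19),
    "Mountain": (4, 11),
    "Pacific": (18, 47),
}

# Rank the (X, Y) pairs from the densest division to the least dense.
pairs = sorted(divisions.values(), key=lambda xy: xy[0], reverse=True)
largest_four = pairs[:4]
smallest_four = pairs[-4:]

avg_y_large = sum(y for _, y in largest_four) / 4
avg_y_small = sum(y for _, y in smallest_four) / 4

print(f"average Y, four largest X:  {avg_y_large:.2f}")
print(f"average Y, four smallest X: {avg_y_small:.2f}")
```

A comparison of this kind is sensitive to single cases: the Pacific division alone contributes 47 to the sum of 83 for the four least dense divisions.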

A better way of prejudging correlation is by means of a
scatter diagram. The X and Y values are plotted on rectangular
coordinate paper, as shown in Fig. 43.¹ It is now seen that, if
the point for the Pacific region is omitted, the remaining points
show no discernible tendency either to rise or to fall across the
table. Any correlation present must, therefore, be due to a
single case. It would be misleading to say that between 1920
and 1930 there was a tendency for population in the United
States to increase at a faster rate in thinly populated regions than
in thickly populated regions, when as a matter of fact this was
true in only one out of nine regions. There is accordingly no
point in going any further with this problem, unless we wish to
try areas smaller than census divisions.

Fig. 43. Scatter diagram for Table 49.

1 For example, the first pair of values constitute a point with the
coordinates (119, 10). To plot this point in Fig. 43, after drawing the horizontal X
axis and the Y axis perpendicular to it, we measure 119 units from the origin
at 0 along the X axis, then up 10 Y units parallel to the Y axis, and there
mark in the point.

Consider a second problem. Do the counties of Wisconsin
that have high birth rates also tend to have high death rates?
Waiving the objections that a county is not always a
homogeneous unit (e.g., a county may be half urban and half rural),
and that its population is often too small to yield reliable birth
and death rates, let us compare the first 20 counties of the state,
taken alphabetically, in 1935. The data are in Table 50.


Table 50. Birth and Death Rates by Counties in Wisconsin, 1935*

                    Birth       Death
    County          rate (X)    rate (Y)        XY          X²          Y²
    Adams             18.6        9.7         180.42      345.96       94.09
    Ashland           22.2       12.0         266.40      492.84      144.00
    Barron            18.4       10.4         191.36      338.56      108.16
    Bayfield          12.5        8.3         103.75      156.25       68.89
    Brown             22.1       11.6         256.36      488.41      134.56
    Buffalo           17.5        6.9         120.75      306.25       47.61
    Burnett           17.2       10.3         177.16      295.84      106.09
    Calumet           15.7        6.8         106.76      246.49       46.24
    Chippewa          20.5       12.1         248.05      420.25      146.41
    Clark             17.3        7.4         128.02      299.29       54.76
    Columbia          17.4       13.9         241.86      302.76      193.21
    Crawford          22.5       10.1         227.25      506.25      102.01
    Dane              17.1       13.8         235.98      292.41      190.44
    Dodge             14.4        9.2         132.48      207.36       84.64
    Door              20.8        9.8         203.84      432.64       96.04
    Douglas           16.2       12.2         197.64      262.44      148.84
    Dunn              18.7        9.3         173.91      349.69       86.49
    Eau Claire        22.0       12.2         268.40      484.00      148.84
    Florence          17.8       10.5         186.90      316.84      110.25
    Fond du Lac       17.3       11.1         192.03      299.29      123.21
    Total            366.2      207.6       3,839.32    6,843.82    2,234.78

* From Report of the State Board of Health, Wisconsin, 1934-1935, p. 210.

    $M_x = \dfrac{366.2}{20} = 18.31$,    $M_y = \dfrac{207.6}{20} = 10.38$.
We shall apply the device of the scatter diagram to these
figures. The results are shown in Fig. 44.

From Fig. 44, we notice first that the range taken by the
points is limited, none falling below 12 or above 23 on the X
scale, and none below 6 or above 14 on the Y scale. It is a
general precaution that as a rule any correlation found for a
given set of data should not be assumed to exist outside the
range of the data. A man may accept a wage of 50 cents an
hour to work eight hours or perhaps even 12 hours without
resting, but it would be erroneous to suppose from this that he
would continue to work an indefinite number of hours at that
rate. After 12 or 14 hours, it would probably require more than
50 cents to induce him to work another hour. Thus the
relationship between wages (X) and length of work period (Y) would
not be the same beyond the range of 12 hours as within that
range. Similarly, counties with birth rates much below 12 or
above 23 might show death rates entirely out of line with what
would be expected from the relationship found between birth
and death rates in the counties included in the study.

Fig. 44. Scatter diagram for Table 50.

A second fact shown by Fig. 44 is that there is a general
tendency for the points to rise in the positive direction along the
X scale. That is, as the birth rates in the counties increase, the
death rates tend to increase also. This indicates that there is
some positive correlation between the two kinds of rates that
seems worthy of further investigation. We would not expect a
high correlation, however, because the dots show considerable
scatter, instead of following one another in a continuous line
or curve.

It should be pointed out that if the data in Fig. 44 had fallen
instead of rising in the positive direction along the X scale, a
negative relationship would have been indicated. That is,
there would have been a tendency for the death rates to decline
as the birth rates increased. A negative correlation, of course,
shows just as much relationship as a positive correlation of the
same degree.

2. The Line of Regression : Ungrouped Data. — In simple cor- 
relation it is customary, whenever reasonable, to regard one of 
the factors, Z, as an mdependent factor, and the other, F, as a 
dependent factor. Thus, above, the birth rate is taken as the 
independent factor, X, and the death rate as the dependent 
factor, F, because the birth rate is beheved to influence the 
death rate, rather than vice versa. 

Returning to Fig. 44, the next step in the attempt to measure
the amount of correlation between the X and Y factors is to
ask what is the form of the observed correlation. From inspection
of the figure, it appears that the simplest way to represent
the relationship is by means of a straight line. This is fortunate,
because the method of simple correlation that is described in
this chapter deals only with straight-line, or linear, relationships.
Relationships that take the form of curved lines are measured
by other methods. When it seems advisable to use a formal
mathematical test to determine whether or not a relationship is
linear, the description of such a test may be found in more
advanced texts.¹

[Fig. 45. Geometric meaning of the equation of a straight line.]

Although, of course, no one line will fit all the points in
Fig. 44, mathematics furnishes a formula for determining the
line of best fit, which is usually called the line of regression
of Y on X. The general equation of a straight line is

$$Y_c = a_{yx} + b_{yx}X, \tag{65}$$

where a is the intercept of the line on the Y axis, and b is the slope
of the line with respect to the X axis, or the ratio of c to d in
Fig. 45. (This follows from the argument that at any point, P,
on the line, Y = a + c; but by definition b = c/d, where d = X, or
c = bX; therefore, Y = a + bX.)

To determine the values of the constants, a and b, that will 
give the line of best fit, the following normal equations are used: 

$$b_{yx} = \frac{\Sigma XY - N M_x M_y}{\Sigma X^2 - N M_x^2} = \frac{N\Sigma XY - \Sigma X\,\Sigma Y}{N\Sigma X^2 - (\Sigma X)^2} = \frac{\Sigma xy}{\Sigma x^2}, \tag*{(66)*}$$

$$a_{yx} = M_y - b_{yx}M_x, \tag{67}$$

where the subscripts yx indicate the regression of Y on X.

¹ G. U. Yule and M. G. Kendall, An Introduction to the Theory of
Statistics, pp. 455-456, Charles Griffin & Company, Ltd., London, 1937.

From Table 50, we substitute in formula (66):

$$b_{yx} = \frac{3839.32 - 20(18.31)(10.38)}{6843.82 - 20(18.31)^2},$$

$$b_{yx} = .27516,†$$

$$a_{yx} = 10.38 - .27516(18.31) = 5.34182.$$

Substituting these values of a and b in formula (65),

$$Y_c = 5.3418 + .27516X. \tag{68}$$

Putting X = 12.5 in formula (68), we have

$$Y_c = 5.3418 + .27516(12.5),$$

$$Y_c = 8.78130.$$

Letting X = 22,

$$Y_c = 11.39534.$$

Plotting these two calculated points, (12.5, 8.78) and (22,
11.395), in Fig. 44, we get the line of regression of Y on X there
shown. If X = 0, $Y_c$ = 5.34 = a.
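The substitution above can be checked with a short Python sketch. It uses only the sums quoted in the text (N = 20, the two means, ΣXY, and ΣX²); the function and variable names are illustrative choices, not part of the original.

```python
# Sums for the 20 counties of Table 50, as quoted in the text.
N = 20
MEAN_X, MEAN_Y = 18.31, 10.38
SUM_XY = 3839.32
SUM_X2 = 6843.82


def fit_line(n, mean_x, mean_y, sum_xy, sum_x2):
    """Return (a, b) for Yc = a + b*X, following formulas (66) and (67)."""
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(N, MEAN_X, MEAN_Y, SUM_XY, SUM_X2)


def predict(x):
    """Most probable Y for a given X, formula (68)."""
    return a + b * x


print(f"b = {b:.5f}, a = {a:.5f}")
print(f"Yc at X = 12.5: {predict(12.5):.2f}")
print(f"Yc at X = 22:   {predict(22):.2f}")
```

Evaluating `predict(MEAN_X)` returns the mean of Y, which illustrates that the fitted line passes through the point of the means.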

If the origin is shifted to the means of the two series‡ (Fig. 46),
equation (65) becomes

$$y_c = b_{yx}x, \tag{69}$$

where x and y are deviates from their respective means. For the

present problem, this gives

$$y_c = .27516x, \tag{70}$$

which is a simpler equation and often easier to handle than
equation (68). The $y_c$ values calculated from this equation,
however, are of course not directly comparable with the observed
Y's. For that reason equation (68) is used to provide the values
in Table 51.

[Fig. 46. Shift of axes necessary to change regression line to mean deviate form ($y_c = b_{yx}x$).]

* Also, see formula (88).

† These figures are carried to several decimal places to provide a check
in the summation of the third column of Table 51. If the work has been
correctly done, this column will sum approximately to zero.

‡ Notice that the mean of the $Y_c$ values calculated from the regression
equation is equal to the mean of the observed Y values. This may be shown
algebraically by replacing a in equation (65), above, with its equivalent
from equation (67):

$$Y_c = a_{yx} + b_{yx}X,$$

$$Y_c = M_y - b_{yx}M_x + b_{yx}X,$$

$$M_{y_c} = M_y - b_{yx}M_x + b_{yx}M_x = M_y.$$

If the second equation above is expressed in terms of mean deviates, we get

$$(Y_c - M_y) = (M_y - M_y) - b_{yx}(M_x - M_x) + b_{yx}(X - M_x);$$

$$(Y_c - M_y) = b_{yx}(X - M_x),$$

or

$$y_c = b_{yx}x.$$

Subtracting $M_y$ from each Y value and $M_x$ from each X value in equation
(65) is equivalent to measuring all Y values from the mean of the Y's, and
all X values from the mean of the X's. That is, the Y axis in Fig. 46 is
simply moved to the right to the mean of the X's, and the X axis is moved
up a distance equal to the mean of the Y's. This, of course, places the
intersection of the two new axes at a point which has for its coordinates
the means of the two series ($M_x$, $M_y$). Since this point is the origin of the
system of axes from which all values of X and Y are to be measured, however,
it is convenient to give it the coordinates (0, 0). This is also necessary if we
express x and y in mean deviate form as in equation (69), because at the
point of the means the value of every mean deviate must be zero.

It follows from the second equation above that the regression line always
passes through the point ($M_x$, $M_y$), since, if we let X = $M_x$,

$$Y_c = M_y - b_{yx}M_x + b_{yx}M_x, \qquad Y_c = M_y.$$

The same fact appears if we let x = 0 in equation (69): $y_c = b_{yx}(0)$, $y_c = 0$.

A measure of the goodness of fit of the regression line
$Y_c = 5.34 + .275X$ to the points in Fig. 44 is given by the
formula for the standard error of estimate, $S_y$:

$$S_y = \sqrt{\frac{\Sigma d^2}{N}}, \tag{71}$$

where d is the difference between the observed and the calculated
Y values, and N is the number of paired values. The d's are
shown in Table 51:


Table 51. Values of d and d²

Observed     Calculated
  (Y)          (Yc)            d            d²

   9.7       10.45980      - .75980       .57730
  12.0       11.45037      + .54963       .30209
  10.4       10.40476      - .00476       .00002
   8.3        8.78132      - .48132       .23167
  11.6       11.42286      + .17714       .03138
   6.9       10.15712      -3.25712     10.60883
  10.3       10.07457      + .22543       .05081
   6.8        9.66183      -2.86183      8.19007
  12.1       10.98260      +1.11740      1.24858
   7.4       10.10209      -2.70209      7.30129
  13.9       10.12960      +3.77040     14.21592
  10.1       11.53292      -1.43292      2.05326
  13.8       10.04706      +3.75294     14.08456
   9.2        9.30412      - .10412       .01084
   9.8       11.06515      -1.26515      1.60060
  12.2        9.79941      +2.40059      5.76283
   9.3       10.48731      -1.18731      1.40971
  12.2       11.39534      + .80466       .64748
  10.5       10.23967      + .26033       .06777
  11.1       10.10209      + .99791       .99582

 207.6      207.59999        .00001     69.39083

or, using formula (72),

$$S_y = \sqrt{\frac{2234.78 - 5.34182(207.6) - .27516(3839.32)}{20}} = 1.86.$$

The standard error of estimate is like the standard deviation,
except that in the case of the latter the Y values are subtracted
from their mean, while in the case of the former they are
subtracted from the regression line, i.e., from the calculated $Y_c$'s.
Notice in Table 51 that the deviations from regression add to
zero, just as do mean deviations. If the distribution of Y values
is normal, two out of three of the observed Y's will not vary
from the regression line by more than one standard error of
estimate on each side. This may be shown graphically by
plotting in the range $\pm S_y$ from the regression line in Fig. 44.
Adding and subtracting 1.86 and $Y_c$ = 8.78 at X = 12.5, and
then 1.86 and $Y_c$ = 11.40 at X = 22, gives a range of
6.92 to 10.64 at the small end of the scale and a range of
9.54 to 13.26 at the large end. Accordingly, only six counties
(Buffalo, Calumet, Clark, Columbia, Dane, and Douglas) out
of the 20 are found to fall outside the range $\pm 1S_y$. Thus 30 per
cent of the cases exceed the range, compared with 32 per cent
in a strictly normal distribution. This close agreement is in
spite of the small number of counties in Table 50.
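As a rough check on the figures above, the following Python sketch recomputes the deviations of Table 51 from its first two columns, forms the standard error of estimate as the square root of Σd²/N, and counts the counties lying more than one standard error from the line. The variable names are illustrative only.

```python
import math

# Observed Y and calculated Yc, copied from the first two columns of Table 51.
observed = [9.7, 12.0, 10.4, 8.3, 11.6, 6.9, 10.3, 6.8, 12.1, 7.4,
            13.9, 10.1, 13.8, 9.2, 9.8, 12.2, 9.3, 12.2, 10.5, 11.1]
calculated = [10.45980, 11.45037, 10.40476, 8.78132, 11.42286,
              10.15712, 10.07457, 9.66183, 10.98260, 10.10209,
              10.12960, 11.53292, 10.04706, 9.30412, 11.06515,
              9.79941, 10.48731, 11.39534, 10.23967, 10.10209]

# d = observed minus calculated, as in the third column of the table.
d = [y - yc for y, yc in zip(observed, calculated)]
sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

# Standard error of estimate: sqrt(sum of d squared over N).
s_y = math.sqrt(sum_d2 / len(d))

# Counties whose observed value lies more than one S_y from the line.
outside = sum(1 for v in d if abs(v) > s_y)

print(f"sum of d   = {sum_d:.5f}")
print(f"sum of d^2 = {sum_d2:.5f}")
print(f"S_y        = {s_y:.2f}")
print(f"outside +/- 1 S_y: {outside} of {len(d)}")
```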

There is, of course, seldom any reason for using a regression
equation to calculate values of Y for comparison with the data
from which the regression equation was obtained. A regression
equation is rather applied to new data for the purpose of making
predictions. For example, the usefulness of the regression
equation (68), based on Table 50, lies in telling us what death rates
to expect in counties that are not included in the table, or in a
year other than 1935.

Even in the prediction of individual Y values when r is low,
however, it is often possible to reach relatively safe conclusions
by noting the odds in their favor. For example, the most
probable value of Y corresponding to an X value of 18.6 was
found by substituting X = 18.6 in equation (68), giving
$Y_c$ = 10.46. In other words, if we know that a county had a
birth rate of 18.6, we can predict that its most probable death
rate is 10.46, and we can feel some confidence that its actual
death rate will not usually be below 8.60 or above 12.32 (i.e.,
10.46 ± 1.86). If we wish to be surer, the odds are about
20 to 1 in a normal distribution that the death rate of this
county will fall between 10.46 ± (1.86 × 2), i.e., between 6.74
and 14.18. If practical certainty is required, only once in some
369 times in a normal distribution will the death rate exceed
the range of 10.46 ± (1.86 × 3), or 4.88 to 16.04, inclusive.
The spread of possible error is now large, but the advantage over
random guessing is still considerable. This is usually true even
after making allowance for the fact that the distribution is not
normal, and for errors due to sampling.
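The three ranges quoted above follow directly from equation (68) and the standard error of estimate. A minimal Python sketch, assuming the rounded constants a = 5.3418, b = .27516, and $S_y$ = 1.86 from the text (the function name `band` is an illustrative choice):

```python
A, B = 5.3418, 0.27516   # constants of regression equation (68)
S_Y = 1.86               # standard error of estimate


def band(x, k):
    """Return (low, high) for Yc at x, plus or minus k standard errors."""
    yc = A + B * x
    return yc - k * S_Y, yc + k * S_Y


for k in (1, 2, 3):
    low, high = band(18.6, k)
    print(f"+/- {k} S_y: {low:.2f} to {high:.2f}")
```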

The same principle applies to a variety of related questions,
e.g., What is the probability that a county with a birth rate of
17 will have a death rate as low as 8 or as high as 12?
Substituting X = 17 in regression equation (68), we find $Y_c$ = 10,
approximately. The difference between the expected death
rate of 10 and a death rate of 8 or 12 is ±2. If we regard the
death rates of all counties whose birth rate is 17 as normally
distributed about a mean of 10, with a standard deviation of
$S_y$ = 1.86, then the difference ±2 lies 2.00/1.86 = 1.08 standard
deviation units above or below the mean. Referring to a table
of normal areas (Appendix Table 1), we see that practically 36
per cent of the area of the curve falls between the mean and an
ordinate at 1.08σ. Hence we may say that a deviation as great
as or greater than 1.08σ may occur above or below the mean
100.0 − 2(36) = 28 times in 100. The odds are therefore 72 to
28, or roughly 2½ to 1, against such an event.
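The table lookup above can be reproduced numerically. The sketch below is only an illustration: it replaces Appendix Table 1 with the normal distribution function computed from Python's `math.erf`, which is an assumption of this example and not the method of the text.

```python
import math


def normal_cdf(z):
    """Area of the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


s_y = 1.86
z = 2.00 / s_y                          # deviation of 2 in standard units
area_mean_to_z = normal_cdf(z) - 0.5    # area between the mean and the ordinate at z
two_tail = 1.0 - 2.0 * area_mean_to_z   # chance of a deviation at least this large

print(f"z = {z:.2f}")
print(f"area from mean to z = {area_mean_to_z:.3f}")
print(f"chance of exceeding +/- 2: {two_tail:.3f}")
```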

In equations (65) and (66), b, which is the slope of the
regression line of Y on X, is called the regression coefficient. It is a
useful measure, since it shows the number of Y units that the
most probable value of Y changes for each unit change in X.
For example, in equation (68), $Y_c = 5.34 + 0.275X$, the
regression coefficient is 0.275, which means that the most probable
value of Y increases 0.275 of a Y unit for every X unit that
X increases. If the equation were $Y_c = 5.34 - 0.275X$, the
most probable value of Y would decrease 0.275 of a unit for
each unit that X increased.

3. The Coefficient of Correlation: Ungrouped Data. Although
the table of X and Y paired values (Table 50), the scatter
diagram (Fig. 44), the regression equation of Y on X (formula
(65)), the regression coefficient b, and the standard error of
estimate $S_y$ give a great deal of information about the amount
and nature of the relationship between two variables, X and Y,
none of them furnishes in a single figure an index of the amount
of the relationship. This is supplied by the simple Pearsonian
coefficient of correlation, r, which for ungrouped data may be
found from the following formula:

$$r = \frac{\Sigma XY - N M_x M_y}{\sqrt{(\Sigma X^2 - N M_x^2)(\Sigma Y^2 - N M_y^2)}}. \tag*{(73)¹}$$


Applying this formula to Table 50,

$$r = \frac{3839.32 - 20(18.31)(10.38)}{\sqrt{[6843.82 - 20(18.31)^2][2234.78 - 20(10.38)^2]}} = \frac{38.16}{\sqrt{(138.7)(79.89)}},$$

$$r = .36.$$


Since r is a coefficient that can vary only from 0 to ±1, this
is not a high value, indicating rather low relationship between
the birth rates and death rates in the 20 sample counties of

1 Alternative formulas, which are sometimes convement, are 

iVsXF - SXSF ^ 

^/lNxX^ - (SZ)21[ArsF2 - (SF)^/ 

0S7 + 6sxr - N (^y 

7-2 — T^FV ’ 

XV-l! (J) 


SXF - 


SXSF 


XXY - NM^My ^ 

Na-xCy 


r \/6yi&a:y, 

a-x^ + 

where D refers to the differences between the raw paired values 
known as the difference formula 

= _ 1. s £ . it = ^^y', 

NtTxO-y N <Tx <ry N 

(Tc 

r = — j 


(74) 

(75) 

(76) 

(77) 

(78) 

(79) 

(80) 

This is 

(81) 

(82) 


where $\sigma_c$ is the standard deviation of the $Y_c$'s calculated from the regression
equation. See also formula (89).

Wisconsin. It is about what would be expected from the scatter
diagram (Fig. 44).
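The value of r for Table 50 can be verified from the sums alone. The following Python sketch applies formula (73) to the quantities quoted in the text; the function name is illustrative.

```python
import math

# Sums for Table 50, as quoted in the text.
N = 20
MEAN_X, MEAN_Y = 18.31, 10.38
SUM_XY, SUM_X2, SUM_Y2 = 3839.32, 6843.82, 2234.78


def pearson_from_sums(n, mean_x, mean_y, sum_xy, sum_x2, sum_y2):
    """Pearsonian r from raw sums and means, following formula (73)."""
    numerator = sum_xy - n * mean_x * mean_y
    denominator = math.sqrt((sum_x2 - n * mean_x ** 2) *
                            (sum_y2 - n * mean_y ** 2))
    return numerator / denominator


r = pearson_from_sums(N, MEAN_X, MEAN_Y, SUM_XY, SUM_X2, SUM_Y2)
print(f"r = {r:.2f}")
```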

The labor of computing a correlation coefficient from ungrouped
data can sometimes be reduced by dividing one or both series by
some appropriate divisor, or by subtracting an arbitrary constant
from the values of either or both series. As will be seen, this
does not affect the value of r. The method also applies to the
regression equation, provided the original values are restored.
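That subtracting a constant or dividing by a positive divisor leaves r unchanged can be illustrated numerically. In the Python sketch below the five paired values are hypothetical, chosen only for the demonstration, and `pearson_r` is an illustrative helper that works from mean deviates.

```python
import math


def pearson_r(xs, ys):
    """Pearsonian r computed from deviations about the two means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired values, not taken from Table 50.
xs = [12.5, 15.0, 17.5, 20.0, 22.0]
ys = [8.3, 9.9, 9.1, 11.6, 12.2]

# "Coded" series: subtract an arbitrary constant, then divide by a divisor.
xs_coded = [(x - 10.0) / 2.5 for x in xs]
ys_coded = [(y - 8.0) / 0.1 for y in ys]

r_raw = pearson_r(xs, ys)
r_coded = pearson_r(xs_coded, ys_coded)
print(f"r from raw values:   {r_raw:.6f}")
print(f"r from coded values: {r_coded:.6f}")
```

The two printed values agree apart from floating-point rounding. A negative divisor applied to one series alone would reverse the sign of r, which is why the divisor is taken as positive here.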

4. Size of Sample from Which r Is Calculated. It is assumed
throughout the discussion of this chapter that the coefficient of
correlation, r, is not calculated from very small numbers of
paired values, say less than 25. If this assumption is not met,
and the data are regarded as a sample, many of the formulas
given need correction. Since small-sampling theory is omitted
from this text, the student may see certain references listed at
the end of this chapter for its treatment.¹

5. The Meaning of the Correlation Coefficient, r. It has already
been seen that the standard error of estimate, $S_y$, around the
regression line for Table 50 is approximately 1.86. The variance
of the observed Y's is

$$\sigma_y^2 = \frac{\Sigma Y^2}{N} - \left(\frac{\Sigma Y}{N}\right)^2 = \frac{2234.78}{20} - \left(\frac{207.6}{20}\right)^2,$$

$$\sigma_y^2 = 4.$$

If we compare $S_y^2$ with $\sigma_y^2$ we shall have a measure known as
the coefficient of alienation squared,

$$k^2 = \frac{S_y^2}{\sigma_y^2} = 0.865. \tag*{(84)²}$$

¹ See, for example, Yule and Kendall, Ezekiel, Fisher, and Croxton and
Cowden. The student should not be misled by the circumstance that, in the
example of Table 50, 20 pairs of values were treated as a large sample. This
was done only for convenience of illustration. Strictly, small-sample
methods should be used with 20 cases, although even for that size of sample
it often makes no important difference.

² Compare the distances of the dots from the regression line and from the
mean of the Y's in Fig. 44.

This shows that 86.5 per cent of the variance in county death
rates remains in the form of "scatter" around the regression
line, which is not controlled by the birth rates. Again, by
formula (78),

$$r^2 = 1 - \frac{S_y^2}{\sigma_y^2},$$

or

$$r^2 = 1 - k^2,$$

and

$$r^2 + k^2 = 1. \tag{85}$$

That is, $r^2$ and $k^2$ together account for 100 per cent of the variance
in Y. Since we have just seen that $k^2$ indicates the percentage
not controlled by X, $r^2 = 1 - k^2$ evidently indicates the
percentage controlled by X through the medium of the regression
equation. Thus, above, $r^2 = (.36)^2 = .13$, meaning that a
correlation of r = .36 accounts for only 13 per cent of the variance
of the Y series. This interpretation of $r^2$ is further clarified by
formula (82) squared,

$$r^2 = \frac{\sigma_c^2}{\sigma_y^2}.$$

Here the numerator, $\sigma_c^2$, is the variance of the $Y_c$ series
calculated from the regression equation, so that its value is entirely
controlled by X.

Substituting the values of $r^2$ and $k^2$ found in the illustrative
problem above in formula (85), we get

$$(.36)^2 + .865 = .995,$$

or 99.5 per cent, the slight variation from 100 per cent being due
to approximations in the calculation of $r^2$ and $k^2$.
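The remark that the shortfall from 100 per cent is due only to rounding can be checked by carrying the computation through without rounding. The Python sketch below works from the sums quoted for Table 50 and takes the sum of squared deviations from the numerator of the worked example for formula (72); all names are illustrative.

```python
# Sums for Table 50, as quoted in the text.
N = 20
MEAN_X, MEAN_Y = 18.31, 10.38
SUM_XY, SUM_X2, SUM_Y2 = 3839.32, 6843.82, 2234.78
SUM_Y = N * MEAN_Y                      # 207.6

# Sums of products and squares about the means.
s_xy = SUM_XY - N * MEAN_X * MEAN_Y
s_xx = SUM_X2 - N * MEAN_X ** 2
s_yy = SUM_Y2 - N * MEAN_Y ** 2

b = s_xy / s_xx                          # formula (66)
a = MEAN_Y - b * MEAN_X                  # formula (67)

# Numerator of the worked example for formula (72).
sum_d2 = SUM_Y2 - a * SUM_Y - b * SUM_XY

var_y = s_yy / N                         # variance of observed Y
k2 = (sum_d2 / N) / var_y                # coefficient of alienation squared
r2 = s_xy ** 2 / (s_xx * s_yy)           # square of formula (73)

print(f"variance of Y = {var_y:.4f}")
print(f"k^2 = {k2:.4f}, r^2 = {r2:.4f}, sum = {r2 + k2:.6f}")
```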

Notice, in general, that an r as large as .71 is required to cut
the variance of Y by 50 per cent (if $r^2 = .50$, then
$r = \sqrt{.50} = .71$).

Where both X and Y are assumed to be built up of simple elements
of equal variability all of which are present in Y but some of which
are lacking in X, it can be proved mathematically that $r^2$ measures that
proportion of all the elements in Y which are also present in X. For
that reason, in cases where the dependent variable is known to be
causally related to the independent variable, $r^2$ may be called the
coefficient of determination.¹

¹ Mordecai Ezekiel, Methods of Correlation Analysis, p. 120, John
Wiley & Sons, Inc., New York, 1930.

Although these assumptions seldom hold in practice, it is
customary to regard $r^2$ as a better measure of relationship than r.
At any rate, $r^2$ is a more conservative estimate.

Does the correlation between the birth rates and death rates in
Table 50 mean that the birth rate is the cause of the death rate?
Obviously, being born is not the cause of dying. Sanitary
conditions, medical service, and various other factors determine
death rates. It happens, however, that infants are more
susceptible to death by disease than are older children and adults,
so for this reason, other things being equal, the population with
the largest proportion of infants will have the highest death
rate. In general, it may be said that the presence of simple
correlation between two factors may or may not be accompanied
by a direct or efficient causal connection between them. Often
simple correlation is due to common causes, as when teachers'
salaries and the amount of money spent for alcoholic beverages
rise and fall together with changes in business conditions. There
is much danger that this kind of correlation will be
misinterpreted. Sometimes, as in the case of the birth and death rates
above, one factor is a necessary antecedent but not a direct
cause of a correlated factor. Very rarely, two factors show a
high but purely accidental correlation, as the yield of potatoes
in Great Britain with, say, smallpox epidemics in the United
States. The safest interpretation is that the presence of
correlation between two factors indicates that as one increases the
other tends to increase or decrease, i.e., they vary together to some
extent. Why they vary together may be determined by further
statistical and experimental methods, such as those of partial
correlation and the laboratory, which seek to control the various
interfering factors involved.

Caution should be used in comparing two or more values of r.
It often happens that interfering factors, of which the investigator
takes no account, cause two r's that should be the same to differ
widely, or two r's that should differ widely to appear the same.
Unless "other things are equal," at least broadly, such
comparisons have little point.

6. A Convenient Formula for the Regression Equation When
r Is Known. When the value of r is found before the regression
equation is set up, the latter may conveniently be obtained from
the equation

$$Y_c = M_y + r\frac{\sigma_y}{\sigma_x}(X - M_x), \tag{86}$$

or

$$y_c = r\frac{\sigma_y}{\sigma_x}x. \tag{87}$$

Comparing formula (87) with formula (69), it is seen that

$$b_{yx} = r\frac{\sigma_y}{\sigma_x},$$

or

$$r = b_{yx}\frac{\sigma_x}{\sigma_y}. \tag{88}$$
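The relation between the regression coefficient and r can be checked on the Table 50 figures. The Python sketch below assumes the relation in the form b = r(σ_y/σ_x) and compares the result with the b found directly from formula (66); the names are illustrative.

```python
import math

# Sums for Table 50, as quoted in the text.
N = 20
MEAN_X, MEAN_Y = 18.31, 10.38
SUM_XY, SUM_X2, SUM_Y2 = 3839.32, 6843.82, 2234.78

s_xy = SUM_XY - N * MEAN_X * MEAN_Y
s_xx = SUM_X2 - N * MEAN_X ** 2
s_yy = SUM_Y2 - N * MEAN_Y ** 2

sigma_x = math.sqrt(s_xx / N)            # standard deviation of X
sigma_y = math.sqrt(s_yy / N)            # standard deviation of Y
r = s_xy / math.sqrt(s_xx * s_yy)        # formula (73)

b_direct = s_xy / s_xx                   # formula (66)
b_from_r = r * sigma_y / sigma_x         # assumed form: b = r * sigma_y / sigma_x


def predict_from_r(x):
    """Regression estimate built from r, the means, and the two sigmas."""
    return MEAN_Y + r * (sigma_y / sigma_x) * (x - MEAN_X)


print(f"b from formula (66): {b_direct:.5f}")
print(f"b from r and sigmas: {b_from_r:.5f}")
print(f"Yc at X = 22:        {predict_from_r(22):.2f}")
```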


7. Simple Linear Correlation Applied to Grouped Data. The
method of dealing with simple linear correlation developed above
applies to ungrouped data, such as shown in Table 50. In the
case of grouped data, the principles and procedures are the
same, except that formulas (89) through (92) are specially
adapted for use with frequency tables.

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2\,\Sigma y^2}}, \tag{89}$$

where

$$\Sigma xy = \Sigma f_{xy}d_xd_y - \frac{(\Sigma f_xd_x)(\Sigma f_yd_y)}{N}, \tag{90}$$

$$\Sigma x^2 = \Sigma f_xd_x^2 - \frac{(\Sigma f_xd_x)^2}{N}, \tag{91}$$

$$\Sigma y^2 = \Sigma f_yd_y^2 - \frac{(\Sigma f_yd_y)^2}{N}. \tag{92}$$

x and y are mean deviates, $d_x$ represents unit step deviations
from an assumed mean of the X's, $d_y$ represents unit step
deviations from an assumed mean of the Y's, N is the total frequency
of pairs in the table, $f_x$ is the total frequency of pairs in an X class
or column, $f_y$ is the frequency of pairs in a Y class or row, and
$f_{xy}$ is the frequency of pairs in a cell. These symbols appear in
the margins of correlation Table 53.

It is reasonable that the proportion of children in a state's
population should influence the percentage of the state's income
that is spent for schooling. Let us measure the extent to which
this is true. The data needed are in Table 52. For our purpose
it is not necessary to weight the percentage figures by the state
populations.



186 


ELEMENTARY SOCIAL STATISTICS 


Table 52. Percentage of Population under 19 Years of Age in 1930,
and Percentage That School Expenditures Were of All
Income in 1928, by States*

                   Per cent of population     Per cent school
State              under 19 years of age,     expenditures were of
                   1930 (X)                   all income, 1928 (Y)

Southeast
  Virginia               44.4                      2.61
  N. Carolina            49.3                      4.38
  S. Carolina            50.6                      3.16
  Georgia                46.3                      1.75
  Florida                39.2                      5.76
  Kentucky               43.9                      2.29
  Tennessee              43.8                      2.57
  Alabama                47.0                      2.74
  Mississippi            46.6                      3.94
  Arkansas               45.[?]                    2.55
  Louisiana              44.0                      2.61
Southwest
  Oklahoma               44.2                      3.27
  Texas                  42.6                      2.57
  N. Mexico              46.8                      3.40
  Arizona                42.1                      3.67
Northeast
  Maine                  37.3                      1.93
  N. Hampshire           35.2                      2.14
  Vermont                37.0                      2.24
  Massachusetts          35.1                      1.85
  R. Island              37.0                      1.89
  Connecticut            37.0                      2.46
  N. York                33.6                      2.11
  N. Jersey              36.1                      3.20
  Delaware               35.9                      1.91
  Pennsylvania           39.4                      2.20
  Maryland               37.2                      1.97
  W. Virginia            46.1                      3.21
Middle States
  Ohio                   36.1                      3.05
  Indiana                36.5                      3.93
  Illinois               34.9                      2.28
  Michigan               37.7                      3.92
  Wisconsin              38.0                      2.95
  Minnesota              38.3                      3.55
  Iowa                   37.2                      3.82
  Missouri               35.7                      2.46
Northwest
  N. Dakota              45.4                      6.13
  S. Dakota              42.5                      5.78
  Nebraska               39.3                      3.95
  Kansas                 38.1                      4.24
  Montana                39.0                      3.96
  Idaho                  42.8                      4.02
  Wyoming                39.2                      3.30
  Colorado               38.0                      3.29
  Utah                   46.1                      3.91
Far West
  Nevada                 31.8                      3.33
  Washington             33.7                      2.80
  Oregon                 33.1                      3.31
  California             30.4                      3.25

United States            38.8                      2.74

* From T. J. Woofter, Jr., Landlord and Tenant on the Cotton Plantation, WPA
Research Monograph V, 1936, p. 141.

The 48 pairs of values in Table 52 are hardly enough to justify
grouping, but are convenient for illustrating the grouped method.
The entries in Table 53 are made from the ungrouped data of
Table 52, as follows. X represents the percentage of the
population under 19 years of age, and Y is the percentage that
expenditures for school purposes were of total income in 1928. The
first state in Table 52 has X = 44.4, so it will fall somewhere
in col. 44.0-45.9 of Table 53. Since the corresponding Y value
is 2.61, a tally is entered in row 2.40-2.79 of col. 44.0-45.9.
Similarly, the second state has an X value of 49.3 and a Y value
of 4.38, so a tally is placed in col. 48.0-49.9 and row 4.00-4.39
of Table 53; and so on. After all the entries are tallied in the
cells, the tallies are counted and replaced by numbers.
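The tallying procedure can be sketched in Python as follows. The class widths (2.0 for X, .40 for Y) come from the class labels quoted above; the lowest class limits used here (30.0 and 1.60) are assumptions made for the illustration, since Table 53 itself is not reproduced, and the helper names are likewise hypothetical.

```python
from collections import Counter

# Class grids. Widths follow the labels in the text; the lowest limits
# (30.0 for X, 1.60 for Y) are assumed for this illustration only.
X_GRID = dict(lowest=30.0, width=2.0, scale=10, digits=1)
Y_GRID = dict(lowest=1.60, width=0.40, scale=100, digits=2)


def interval(value, lowest, width, scale, digits):
    """Label of the class containing value.

    Values are converted to whole units (tenths or hundredths) first,
    so class boundaries are not disturbed by floating-point rounding.
    """
    step = round(width * scale)
    start = round(lowest * scale)
    index = (round(value * scale) - start) // step
    lo = start + index * step
    hi = lo + step - 1
    return f"{lo / scale:.{digits}f}-{hi / scale:.{digits}f}"


def cell(x, y):
    """Return the (column, row) labels of the cell receiving the tally."""
    return interval(x, **X_GRID), interval(y, **Y_GRID)


# The first two states of Table 52.
pairs = [(44.4, 2.61), (49.3, 4.38)]
tallies = Counter(cell(x, y) for x, y in pairs)

for (col, row), count in tallies.items():
    print(f"col. {col}, row {row}: {count}")
```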

In Table 53 we then see two ordinary frequency distributions,
X and Y, placed at right angles to each other and exhibiting
a double classification. The large figures in the cells are the
frequencies. Instead of making a scatter diagram, as we did
with ungrouped data, let us estimate the mean of the Y's in
each column of the table. Consider, for example, the column
with the heading 34.0-35.9. We have for the mean

$$\frac{(2.6 \times 1) + (2.2 \times 2) + (1.8 \times 2)}{5} = 2.12.$$

This may be marked by a small circle at the left side of the
column, although if it did not interfere with reading the table
it should be located at the mid-point of the column. Similar
circles indicate the positions of the means of the other columns
which have a frequency as large as five. An inspection of these
means shows that they have an irregular tendency to rise in the
positive direction across the table. This suggests some positive
correlation between X and Y. However, the circles form more
of a curve than a straight line, rising to a peak in the 38.0-39.9
column and then descending slightly. If we suppose that we
are dealing with a sample thrown up by a particular set of
causes, some of the irregularities may be due to random factors
and a small sample. But even if we make allowance for the
extreme cases in cols. 38.0-39.9, 42.0-43.9, and 44.0-45.9, the
curved effect is not lessened. To assume that the relationship
is linear and estimate the amount of correlation on that basis
will reduce the value of the coefficient slightly, compared with
the use of a coefficient of curvilinear correlation. Since we
cannot deal with curvilinear correlation here, we shall use the
simpler straight-line hypothesis. There is also some justification
for this in view of the fact that the large scatter indicates a low
correlation in any case.

The line of regression of Y on X, and dotted lines representing
$\pm 1S_y$, the values for which are worked out below, are drawn in
the correlation table (Table 53). A study of them in relation
to the entries in the correlation table should be helpful, just as
it was in the case of the scatter diagram for ungrouped data
(see Fig. 44). It appears from Table 53 that the actual
relationship changes from strongly positive in the left half of the table
to moderately negative in the right half, whereas the linear
regression implies a constant positive correlation throughout.
Also, the linear equation is far from fitting the data of the two
halves of the table equally well. On the other hand, in only one
column does the proportion of items falling outside the range of
one standard error of estimate around the regression line exceed
the normal one-third. In practice it would probably not be
worth while to carry the analysis any farther. We shall,
however, use the table to show the steps involved in calculating the
Pearsonian correlation coefficient, r, the linear regression of Y
on X, the standard error of estimate, $S_y$, and other statistics,
from grouped data.

Proceeding with Table 53, we enter unit-step deviations in row
(2) and col. (2). The entries in row (3) and col. (3) and in row
(4) and col. (4) are familiar and should be obvious from the
symbols. Next, we multiply each cell frequency first by $d_x$ and
place the product in the upper right-hand corner of the cell,
and then by $d_y$ and place the product in the lower left-hand
corner of the cell. The $d_x$ products are then added by rows and
the $d_y$ products by columns. Column (6) and row (6) are
obtained by multiplying the entries in col. (5) and row (5) by
$d_y$ and $d_x$, respectively, and the products are summed over the
column and the row.²

We finally substitute from Table 53 in formulas (90) through (92),

$$\Sigma xy = 79 - \frac{(25)(19)}{48} = 69.1,$$

$$\Sigma x^2 = 315 - \frac{(25)^2}{48} = 302,$$

$$\Sigma y^2 = 317 - \frac{(19)^2}{48} = 309.5.$$

¹ There is also always a regression line of X on Y, from which the most
probable values of X may be calculated for given values of Y. The two
regression lines are not the same. To find the regression of X on Y, simply
change places with X and Y in the equations given in this chapter.

² As a check on the work, notice that in Table 53 cols. (3), (5), and (6)
should have the same totals as rows (5), (3), and (6), respectively.

Substituting these values in formula (89),

$$r = \frac{69.1}{\sqrt{(302)(309.5)}} = \frac{69.1}{\sqrt{93{,}469}} = \frac{69.1}{305.7},$$

$$r = .23.$$
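The grouped computation can be verified from the marginal totals alone. The Python sketch below uses the totals substituted above (79, 25, 19, 315, 317, and N = 48); the variable names are illustrative.

```python
import math

# Marginal totals of Table 53 as substituted in formulas (90) to (92).
N = 48
SUM_F_DX_DY = 79     # sum of f_xy * d_x * d_y
SUM_F_DX = 25        # sum of f_x * d_x
SUM_F_DY = 19        # sum of f_y * d_y
SUM_F_DX2 = 315      # sum of f_x * d_x squared
SUM_F_DY2 = 317      # sum of f_y * d_y squared

sum_xy = SUM_F_DX_DY - SUM_F_DX * SUM_F_DY / N    # formula (90)
sum_x2 = SUM_F_DX2 - SUM_F_DX ** 2 / N            # formula (91)
sum_y2 = SUM_F_DY2 - SUM_F_DY ** 2 / N            # formula (92)

r = sum_xy / math.sqrt(sum_x2 * sum_y2)           # formula (89)

print(f"sum xy  = {sum_xy:.1f}")
print(f"sum x^2 = {sum_x2:.1f}")
print(f"sum y^2 = {sum_y2:.1f}")
print(f"r       = {r:.2f}")
```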

This value of r indicates very little relationship. Nevertheless,
for purposes of demonstration, we shall show the use of the
formulas for finding the regression equation of Y on X and the
coefficient of alienation, k. We have

$$b_{yx} = \frac{\Sigma xy}{\Sigma x^2} = \frac{69.1}{302} = .23. \tag{93}$$

But this value of b is in terms of unit-step deviations or class
intervals. To change it back to scale units,

$$b_{yx}\ (\text{scale units}) = b_{yx}\ (\text{unit steps}) \times \frac{i_y}{i_x}, \tag{94}$$

where $i_y$ = class interval of Y,
$i_x$ = class interval of X.

$$b_{yx} = .23 \times \frac{.40}{2} = 0.046.$$

$$a_{yx} = M_y - b_{yx}M_x. \tag{95}$$

$$a_{yx} = 3.16 - .046(40),$$

$$a_{yx} = 1.32.$$

Therefore, substituting in formula (65), we have

$$Y_c = 1.32 + .046X;$$

also,

$$S_y^2 = \sigma_y^2(1 - r^2). \tag{96}$$

Thus an r of .23 leaves 95 per cent of $\sigma_y^2$ as scatter around the
regression equation, or improves prediction only

$$r^2 = 1 - k^2 = .05,$$

or about 5 per cent, in terms of the variance of the Y's.
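The change from unit steps back to scale units can be followed in a few lines of Python. The sketch assumes the class intervals .40 for Y and 2.0 for X implied by the class labels, and the means 3.16 and 40 used in the text; the names are illustrative. Because the unrounded slope is carried through, the intercept differs slightly from the rounded 1.32.

```python
# Quantities from the grouped example in the text.
SUM_XY, SUM_X2 = 69.1, 302.0     # in unit-step deviations
I_Y, I_X = 0.40, 2.0             # class intervals of Y and X
MEAN_Y, MEAN_X = 3.16, 40.0
R = 0.23

b_steps = SUM_XY / SUM_X2        # slope in unit steps
b_scale = b_steps * I_Y / I_X    # slope restored to scale units
a_scale = MEAN_Y - b_scale * MEAN_X

k2 = 1 - R ** 2                  # share of variance left as scatter
r2 = R ** 2                      # share of variance accounted for

print(f"b in unit steps = {b_steps:.2f}")
print(f"b in scale units = {b_scale:.3f}")
print(f"a = {a_scale:.2f}")
print(f"k^2 = {k2:.2f}, r^2 = {r2:.2f}")
```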

The student is asked to check the plotting of the regression
line and the lines showing the standard error of estimate in
Table 53.

Regression equations (69), (86), and (87) also apply to grouped
data.

From the above, it is clear that there is little tendency for the
percentage of income expended for schools to be proportionate
to the percentage of children under 19 years old in the population
when states are taken as units and a linear relationship is assumed.
Apart from the latter assumption, which has already been
discussed, it may well be objected that a state is a large area,
within which very different relations between these two
percentages may exist. Thus a large city and a rural county in
the same state may be more sharply unlike in this respect than
two cities in separate states. For this reason, the average
relationship given for each state as a whole is likely to be
unrepresentative, and so to lack meaning. It would be much better
if the data were available by school districts, in which case a
higher correlation might be found.

8. The Rank Correlation. A method of linear correlation that
takes account of the rank orders of paired items but disregards
their values is sometimes used for rough work, or when the
values of the items are not known. The formula is

$$\rho = 1 - \frac{6\Sigma D^2}{N(N^2 - 1)}, \tag*{(98)¹}$$

where D is the difference between the ranks of a pair of items,
and N is the number of pairs.

As an illustration of the use of this formula, let us refer back
to Table 50, and rank the counties with respect to their death
rates and birth rates, as shown in Table 54. When there are
ties, as between Douglas and Eau Claire counties in death rates,
and Clark and Fond du Lac in birth rates, the tied items are
given the mean of the ranks they would occupy if they were not
equal, and the next item takes the rank just above the highest
rank used in finding the tied mean. For example, the ranks 7
and 8 are averaged to give (7 + 8)/2 = 7.5 as the mean rank of
Clark and Fond du Lac counties, and Columbia county has the
rank 9.

¹ ρ is the lower-case Greek letter rho.


Table 54. Twenty Wisconsin Counties Ranked with Respect to
Birth Rates and Death Rates (Low to High)

                 Rank in        Rank in
County           birth rate     death rate     Difference      D²
                 (X)            (Y)            (D)

Bayfield           1              4              - 3             9
Dodge              2              5              - 3             9
Calumet            3              1                2             4
Douglas            4             17.5            -13.5         182.25
Dane               5             19              -14           196
Burnett            6             10              - 4            16
Clark              7.5            3                4.5          20.25
Fond du Lac        7.5           13              - 5.5          30.25
Columbia           9             20              -11           121
Buffalo           10              2                8            64
Florence          11             12              - 1             1
Barron            12             11                1             1
Adams             13              7                6            36
Dunn              14              6                8            64
Chippewa          15             16              - 1             1
Door              16              8                8            64
Eau Claire        17             17.5            -  .5            .25
Brown             18             14                4            16
Ashland           19             15                4            16
Crawford          20              9               11           121

Total                                                          972


Substituting in formula (98),

$$\rho = 1 - \frac{6(972)}{20(400 - 1)} = .27.$$

Like r, the value of ρ may vary from +1.0 to −1.0.
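The rank method, including the treatment of ties, can be sketched in Python. The ranks are copied from Table 54; `mean_ranks` is an illustrative helper that assigns tied items the mean of the ranks they would otherwise occupy, as described above, and is not a function from the text.

```python
def mean_ranks(values):
    """Rank values from low to high, giving tied items their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_correlation(rank_x, rank_y):
    """Rank correlation: 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2


# Ranks copied from Table 54 (birth rate, death rate), county by county.
birth = [1, 2, 3, 4, 5, 6, 7.5, 7.5, 9, 10,
         11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
death = [4, 5, 1, 17.5, 19, 10, 3, 13, 20, 2,
         12, 11, 7, 6, 16, 8, 17.5, 14, 15, 9]

rho, sum_d2 = rank_correlation(birth, death)
print(f"sum of D^2 = {sum_d2}")
print(f"rho = {rho:.2f}")
print(mean_ranks([10, 20, 20, 30]))
```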


Exercises

1. a. What is the amount of relationship between the length of French
and English words in the accompanying table? Plot the data, and
discuss the scatter diagram. Is the relationship reasonably linear? Use
both the ungrouped and the grouped methods of calculating r as a
check. Do the two methods necessarily give exactly the same value of
r? Explain. Just what does r mean in this case?


Number of Letters in a Sample of French Words (X), and in Their
Nearest English Equivalents (Y)

  X    Y      X    Y      X    Y      X    Y

  1    2      4    5      6    7      8    8
  9    9      6    6      2    4      9   10
  8    8      6    4      3    6      5    5
  6    7      8    9      8    7      9    9
  4    7      5   10      7    7      4    5
  7    7      4    4      5    5      5    4
  7    7     10   17      6    6      3    3
  6    6      5    6      5    4      6    5
  8   11      8    9      8   10     12   11
  8    8      5    6      4    6      5    5
  8   11      5    4     11    8      8    9
  8    7      3    3      5    3     11   11
  8    7      4    3     10   10      8    8
  9    8      7    5      8    8      7   [?]
  7    5      7    8     11    8      7    8
  5    5      6    7     10    8      7    7
  6    5      9    7      8   10      7    7
 12    9      8    6     11    9      6    5
  5    8      5   [?]    10    8     10    9
  8    7      6    2     10    7      9    9
 10   10      5    4      7    6      9    8
  7    8      8    8     11   10      8    9
  5    8      6    7      9    9      9   11
  8    9      9    9     13    6      8    7
  7    8      9    8      8    9      5    6
 10   11      8    7      9    7      7    7
  8    8      3    6      8   10      7    8
  3    3     11   17      9   10      8    5
  7    7      5    4      6    7      5    6


b. Get the regression of Y on X from both the ungrouped and the
grouped data, as a check, plot the line, and explain what a and b mean
in the equation.

c. What is the most probable length of an English word corresponding
to a French word of six letters?

d. Within what range will the number of letters in the English words
in (a) fall two times out of three? Ninety-five times out of 100?
Practically always?

e. What is the value of the coefficient of alienation squared, and
what does it mean here?

f. What is the coefficient of determination and its interpretation
in this problem?

g. Find the coefficient of rank correlation, $\rho$, for the same data,
and compare its value, meaning, and adequacy with r.

2. For the table below, find the value of r and of b, and compare
them in meaning.


Age of Fathers (Y) Correlated with Age of Sons (X)

                  X
   Y        25    27    29
   60        3     5     7
   65        2    11    14
   70              2     6


3. Find and for the following table, and explain their meaning.

Number of Children in the First Generation of Six Families (X),
and the Average Number of Children in the Second Generation
of the Same Families (Y)

   X      3     4     6     7     9    15
   Y      3     2     4     4     5     5


4. a. By inspection, is there any relationship between the votes of
the states in 1876 and in 1932? If any, is it positive or negative?

Republican Vote for President in Nine States, 1876 and 1932

                    Per cent of vote Republican
State                  1876          1932
Massachusetts           58            48
New York                48            41
Wisconsin               55            32
Missouri                41            35
Virginia                41            30
Mississippi             21             4
Louisiana               48             7
Nevada                  53            31
California              51            39

b. What does the scatter diagram show?

c. What is the equation of the regression of Y on X, where X is the
percentage of the vote Republican in 1876, and Y is the percentage of
the vote Republican in 1932? Plot the line in the scatter diagram.

d. What is the standard error of estimate? Plot it in the scatter
diagram.

e. What is the most probable percentage of the vote Republican in
1932 of a state that voted 55 per cent Republican in 1876?

f. Assuming a normal distribution about the regression line, within
what limits of error will the percentage vote fall two out of three
times? 20 out of 21 times? Within what limits of error does it
actually fall in each case?

5. a. What does the scatter diagram show in the case of the
accompanying table of death rates in Connecticut and Massachusetts?

Death Rate in Connecticut and Massachusetts*

Year             1924   1923   1922   1921   1920   1919   1918
Connecticut      11.3   12.0   12.0   11.4   13.6   13.3   20.4
Massachusetts    12.0   13.0   12.8   12.2   13.8   13.6   20.9

* From B. H. Camp, The Mathematical Part of Elementary Statistics,
p. 144, D. C. Heath & Company, Boston, 1935.

b. If the death rate for Massachusetts is 12 in 1924, what is the most
probable death rate for Connecticut in the same year in terms of the
relationship between the two?

c. How much of the variance still remains as scatter in predicting a
death rate in Connecticut from one in Massachusetts?

CHAPTER XI


GROSS RELATIONSHIP BETWEEN TWO FACTORS:
NONQUANTITATIVE CORRELATION

The method of correlation described up
to this point has dealt with quantitative series only, e.g., birth
and death rates, and proportion of state income spent for education.
It often happens in sociological investigations, however, that one
needs to know the amount of relationship between two factors, one or
both of which are qualitative. Examples of qualitative factors are
rural or urban residence; personality ratings like Annoying,
Unsympathetic, Sympathetic; occupational classes (Professional,
Proprietor, Clerical, Skilled, Unskilled); and so on. Methods for
correlating data of this type have been devised. Before using them,
effort should be made to convert the qualitative attributes into
quantitative variables, because the latter are usually more accurate
and reliable. Thus, a student might be classified by the number of
credits earned in college, rather than as Sophomore or Junior.

2. Reliability of Classification. — Since much depends on the
reliability with which the nonquantitative variables are classified,
it is advisable to have the classification repeated by two or more
qualified persons. If the results are very different, better
criteria for classification should be developed, or the problem
dropped.

This point may be illustrated. The questionnaire that the
members of a class in statistics filled out regarding their previous
training in mathematics called for the sex of each student. If
it were desired to correlate success in mathematics with sex,
the members of the class might be divided by sex, and then
subdivided into, say, four groups according to the average
grades received in mathematics. This would give a table like
Table 55.

The question of the reliability of the classification by class
standing in this table can be dismissed, because it is based on a
quantitative variable, the average grades received in mathematics.
The sex classification might be somewhat unreliable
if it depended merely on the Christian names of the students
in the questionnaires, but reference to the questionnaire used
shows that the students checked the words Male and Female.
The reliability of this classification can therefore also be accepted
with confidence. We may then proceed to find the amount of
relationship between the two factors in the table.


Table 55 — Students in a Statistics Class Grouped by Sex and Grades
Received in Mathematics

               Students by class standing
Sex           1      2      3      4      Total
Male          4      6      6      3       19
Female        7     13     15     11       46
Total        11     19     21     14       65


All classifications are not so simple as those in Table 55, how- 
ever. In Table 56, for example, a second competent person 
classified only 66 cases out of each 100 in the same way that this 
table shows, with respect to the economic status of the family. 
This was considered sufficient reason for abandoning the table. 


Table 56 — Economic Status of the Family in Which Parolee Was
Reared and Outcome on Parole

                               Parole violators
Status          Parolees      Number     Per cent
Poor               287           44        15.3
Moderate           261           26        10.0
Comfortable         59            6        10.2
Unknown             22            3          *
Total              629           79

* Sample too small to warrant an estimate.


3. Choice of a Method. — After the reliability of the classifications
in a nonquantitative correlation table has been established,
the question of how to calculate the amount of relationship
between the two factors in the table arises. The answer depends
on the nature of the particular factors to be correlated. It is
convenient to set up a key, as in Table 57, which will suggest
what method should be used in each case.

The terms in Table 57 need definition and illustration. Quantitative
means expressed in countable units, as crime rates or
heights of male freshmen. Qualitative refers to nonmeasured
traits, like those mentioned in the first paragraph of this chapter.
Qualitative Ordered refers to qualitative categories that can be
arranged in ascending or descending order, as Favorable, Indifferent,
Hostile. Qualitative Unordered applies to qualitative
categories that cannot be arranged in ascending or descending
order, e.g., Law, Medicine, Engineering. A dichotomous series
is a series of two mutually exclusive and exhaustive categories,
as Good, Not Good; Sick, Not Sick; Male, Female; College
Graduates, Others; Families with Less than Four Children,
Families with Four or More Children.

Table 57 — Key to Selected Methods of Nonquantitative Correlation

Variable A                     Variable B                     Method
Quantitative; several          Dichotomous                    Biserial, $r_{bis}$
  classes
Quantitative or qualitative,   Qualitative, ordered or        Contingency, C
  ordered or unordered;          unordered; several classes
  several classes                or dichotomous
Dichotomous                    Dichotomous                    Tetrachoric, $r_t$
                                                              Yule's Q
                                                              Fourfold, $r_4$

It is not feasible to deal here with more than the five methods 
listed in Table 57, though a number of less prominent methods 
are omitted. 

4. Biserial Correlation. — In a study of divorce data for the
United States in 1929, it is desirable to know whether there is
any correlation between the party to whom the divorce was
granted and the number of children affected. The data are
shown in the first four columns of Table 58. We have here a
quantitative series to be correlated with a dichotomous series.
According to the key in Table 57, this requires the biserial
method of correlation.

The biserial method that we shall employ assumes that the
dichotomous trait is normally distributed and continuous (i.e.,
there is no gap in the series, and no disarrangement of an ordered
series). The relationship must be linear, or that of a straight
line. In the present case the idea of normality at first seems to
have little meaning. However, if we think of the possibility of
measuring the extent to which the husband or the wife is responsible
for the granting of the divorce, and if it is reasonable to suppose
that one party will seldom be wholly the instigator, but that in
most cases both will be about equally involved, we may perhaps
assume that the distribution of the dichotomous factor is fairly
normal.

Since all reported divorces are included, the series is continuous.
As a rough test whether or not the relationship is linear, the
scatter diagram shown in Fig. 47 is used. In this figure are plotted
the percentages of the divorces granted to husbands by the number of
children affected. The trend, if any, is very irregular. Where there
are no children, a much larger percentage of divorces is granted to
the husband than where there are children. When the number of
children is very large (i.e., eight or nine), the proportion of
divorces granted to the husband falls to a minimum. When the number
of children affected ranges from one to seven, the proportion of
divorces granted to the husband remains practically stationary. The
low percentages of divorces granted to the husband when there are
eight or nine children may be unreliable because of the small number
of cases involved; but the circumstance that the percentage is low
both for eight children and for nine children tends to support
the observed figures. There seems to be little reason for calculating
the value of $r_{bis}$ in this case. We shall do so merely to
show the method.



Fig. 47. — Relation of percentage of divorces granted to husband and
number of children involved. (Horizontal axis: number of children.)

Table 58 — Divorces Granted, Classified According to Number of
Children Affected: 1929*

Children              Divorces granted              Per cent to
affected     To husband    To wife     Total (f)    husband (q')
  (1)           (2)          (3)          (4)           (5)
   0          36,840       76,970      113,810        .3237
   1           8,385       32,223       40,608        .2065
   2           4,255       15,242       19,497        .2182
   3           1,841        6,161        8,002        .2301
   4             774        2,571        3,345        .2314
   5             352        1,191        1,543        .2281
   6             155          518          673        .2303
   7              68          245          313        .2173
   8              22          108          130        .1692
   9              16           77           93        .1720
 Total        52,708      135,306      188,014        .2803
 Mean           0.55         0.77         0.71

* From Marriage and Divorce, 1929, p. 41, U. S. Bureau of the Census.
"Nine or more children" taken as nine, and "no report as to children"
disregarded.


Apparently, the relationship in Table 58 is not linear. We
shall work out the correlation, however, on the assumption that
it is linear. The difference is unimportant here.

The formula for finding biserial r is

$$r_{bis} = \frac{M_2 - M_1}{\sigma} \cdot \frac{pq}{y}, \qquad (99)$$

where $M_1$ is the mean of the smaller frequency distribution [cols.
(1) and (2) of Table 58], $M_2$ is the mean of the larger frequency
distribution [cols. (1) and (3)], $\sigma$ is the standard deviation of
the total frequency distribution [cols. (1) and (4)], p is the
proportion that the total frequency of the larger distribution
[col. (3)] is of the grand total frequency [col. (4)], q = 1 - p, and
y is the height of the ordinate of a normal curve of unit area and
unit standard deviation at the point separating the area of the curve
into the proportions p and q, as found from Appendix Table 1.
The means and standard deviation required are calculated by
the usual method of unit-step deviations from an assumed mean.
The required values are

$$M_1 = 0.55,$$
$$M_2 = 0.77,$$
$$\sigma = 1.14,$$
$$p = \frac{135{,}306}{188{,}014} = .72,$$
$$q = 1.00 - .72 = .28.$$

To find y, we turn to Appendix Table 1. In Fig. 48 a normal
curve is shown. As explained elsewhere, the values given in the
body of the table represent the proportion of the area of the
curve included between the mean ordinate (shown at zero in the
figure) and ordinates erected at various distances, measured in
standard deviation units, from the mean.

Fig. 48. — Normal curve used to find value of y in formula (99).

Since here p = .72, we need to find the height of the ordinate which
divides the curve so that .72 of its area falls to the left and .28
to the right. Evidently, .72 of the area will occupy the whole left
half of the curve, and a proportion .72 - .50 = .22 will extend into
the right half. Looking for .22 in the column of the table headed
"Area," we find as the nearest approximation to it the figure
0.2190, and note that the corresponding figure in the column
headed "Ordinate (y)" is 0.3372. We therefore have y = 0.3372,
and are ready to substitute in formula (99):

$$r_{bis} = \left(\frac{0.77 - 0.55}{1.14}\right) \frac{(.72)(.28)}{.3372},$$

$$r_{bis} = .12.$$
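
The substitution in formula (99) may be repeated by machine. The
sketch below is not part of the original text. It assumes the rounded
values given above (means 0.55 and 0.77, standard deviation 1.14,
p = .72), and it obtains the ordinate y from Python's standard
`statistics.NormalDist` rather than from Appendix Table 1, so y
differs slightly from the tabled 0.3372. The function name
`biserial_r` is our own.

```python
from statistics import NormalDist


def biserial_r(mean_small, mean_large, sigma, p):
    """Biserial r from the two class means, the total standard
    deviation, and p, the proportion of cases in the larger class."""
    q = 1.0 - p
    unit_normal = NormalDist()        # mean 0, standard deviation 1
    z = unit_normal.inv_cdf(p)        # point leaving proportion p to the left
    y = unit_normal.pdf(z)            # height of the ordinate at that point
    return (mean_large - mean_small) / sigma * (p * q) / y


r_bis = biserial_r(mean_small=0.55, mean_large=0.77, sigma=1.14, p=0.72)
print(round(r_bis, 2))
```

Because the ordinate is computed exactly at p = .72 instead of at the
nearest tabled area, the third decimal place may differ from a hand
calculation, but the coefficient rounds to the same two places.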


As would be expected from our preliminary analysis of Table 58
and Fig. 47, the amount of linear relationship between divorces
granted to husbands and the number of children affected is very
slight.

The sign of $r_{bis}$ indicates the direction of the relationship
between the quantitative factor and the proportion of cases in
the distribution represented by p in formula (99). Here there
is a slight positive association between number of children and
divorces granted to the wife, or a slight negative association
between number of children and divorces granted to the husband.

The general conclusion from this analysis is that, if any
correlation is present at all, there is a very slight tendency for the
husband to receive the divorce relatively less often as the number
of children increases. Much more informative, however, was
the interpretation made from the scatter diagram in Fig. 47,
that the proportion of divorces granted to the husband (1) was
considerably greater where there were no children at all, (2) was
little affected by increases in the number of children from one to
seven, and (3) was a minimum when the number of children
was eight or more.

Biserial correlation is a special adaptation of the method of
correlation used in finding the Pearsonian coefficient of correlation,
r, for quantitative data. For this reason, $r_{bis}$ may be
regarded as the nearest approximation to r that can be found
when a quantitative series is correlated with a dichotomous
series.

5. The Coefficient of Contingency. — A total of 1,118 inmates
of a state prison were classified as murderers, sex offenders, and
property offenders. It was desired to know how much, if any,
correlation existed between these three criminal types and
intelligence. An intelligence test was given to all the men, with the
results shown in Table 59. This table contains one quantitative
series and one unordered qualitative classification. The key
in Table 57 indicates the method of contingency for finding the
amount of association present. This coefficient is based on the
Chi-square ($\chi^2$) method, and measures the amount of deviation
of the observed frequencies in the table from purely random or
chance frequencies. The method of finding the chance or
theoretical frequencies, $f_t$, is based on two elementary theorems
in the mathematics of probability which have already been
treated (see Chap. IX). Thus, the probability that any criminal
will fall in, say, the first column of Table 59 is the ratio of the
total number that fall in that column to the total frequency
of the table, or $n_1/N = 70/1{,}118$, where $n_1$ is the total frequency
in col. (1), and N is the total frequency of the table. Likewise,
the probability that any criminal will fall in, say, the first row
is the ratio of the total number that fall in that row to the total
frequency of the table, or $m_1/N = 17/1{,}118$. Now the probability
of two independent events occurring together is the product
of the probabilities of their separate occurrences. Therefore,
the probability that any criminal will fall in both the first column
and the first row of the table is

$$\left(\frac{n_1}{N}\right)\left(\frac{m_1}{N}\right) = \frac{m_1 n_1}{N^2} = \left(\frac{70}{1{,}118}\right)\left(\frac{17}{1{,}118}\right) = 0.000952.$$

This means that about one out of every 1,000 prisoners in Table
59 may be expected by chance alone to fall in the cell common
to col. (1) and row (1). Since there are 1,118 prisoners in the
table, the expected frequency is

$$\frac{m_1 n_1}{N^2}(N) = \frac{17(70)}{(1{,}118)^2}(1{,}118) = 1.0644.$$

This formula may evidently be shortened, however, to

$$f_t = \frac{m_1 n_1}{N}, \qquad (100)$$

giving for the above $f_t = 17(70)/1{,}118 = 1.0644$, again. We now
write this expected frequency in row (1) and col. (2) of Table 59.
By use of formula (100), all of the expected frequencies are
calculated and entered in cols. (2), (7), and (12). This computation
is more easily done for any column by setting n/N in
the calculating machine and multiplying it successively by the
total row frequencies, m. It is a general principle of the $\chi^2$ test
that no cell should contain much less than five expected frequencies.
Any cell that offends in this respect should be combined
with the cell above or below it. For this reason, in Table
59 the frequencies of the first row and of the last two rows are
combined with those just below or above. Comparing now the
observed with the theoretical frequencies, we notice a considerable
amount of difference. This indicates some association
between the criminal classifications and intelligence. We
proceed to measure it by computing

$$\chi^2 = \sum \frac{(f_o - f_t)^2}{f_t} = 19.087 + 25.673 + 7.230 = 51.990, \qquad (101)$$

where $f_o$ is the observed and $f_t$ the theoretical frequency of a cell.
Substituting in the formula for the coefficient of contingency, C,

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}} = \sqrt{\frac{51.99}{1{,}118 + 51.99}} = \sqrt{.0444} = .21. \qquad (102)$$
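
The steps leading to C can be sketched in Python as follows. This
block is not part of the original text, and the function names are
our own. Since Table 59 is not reproduced here, the sketch checks
only the figures quoted above (the first expected frequency and the
value of C); the small 2 × 2 table passed to `chi_square` is a
made-up example whose rows are exactly proportional, chosen only to
show that such a table gives a Chi-square of zero.

```python
from math import sqrt


def expected_frequencies(table):
    """Theoretical cell frequencies: row total times column total over N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[m * c / n for c in col_totals] for m in row_totals]


def chi_square(table):
    """Sum over all cells of (observed - theoretical)^2 / theoretical."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (N + chi2))."""
    return sqrt(chi2 / (n + chi2))


# Figures quoted in the text for Table 59.
first_cell = 17 * 70 / 1118                   # row total 17, column total 70
c = contingency_coefficient(51.990, 1118)

# Hypothetical table with no association at all.
no_association = chi_square([[10, 20], [20, 40]])

print(round(first_cell, 4), round(c, 2), no_association)
```

In practice `chi_square` would be applied to the full table of
observed frequencies after any thin rows had been combined.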


The amount of association between the types of criminals and
intelligence is seen to be low. If we regard our 1,118 prisoners
as a random sample, what is the probability that the value of C
is zero in the total population from which it was drawn? Before
we can refer this question to a table of $\chi^2$ (Appendix Table 2),
we must have regard for the proper degrees of freedom. It will
be recalled¹ that in each row and column of a contingency table
(e.g., Table 59), one of the cell frequencies is not "free," because
it may be determined by subtraction from the marginal totals.
In any row or column, therefore, the number of free cell
frequencies, or degrees of freedom, is one less than the number of
cells (columns or rows). In Table 59 there are three columns
and six rows, so that the degrees of freedom for the whole table
are (3 - 1)(6 - 1) = (2)(5) = 10. With 10 degrees of freedom,
we find in Appendix Table 2 that a $\chi^2$ as great as 23 would occur
by chance only once in 100 trials. Since our $\chi^2$ = 52 is still
larger, we can be sure that the differences are not random.
That is equivalent to saying that the value of C indicates a low
but genuine association between types of criminals and
intelligence.

C is usually found from a shorter formula than that used above:

$$C = \sqrt{\frac{S - 1}{S}}, \qquad (103)^2$$

where

$$S = \sum \left(\frac{f_o^2}{mn}\right). \qquad (104)$$

¹ See Chap. IX, p. 148.

² For the derivation of this formula, see Karl J. Holzinger, Statistical
Methods for Students in Education, p. 275, Ginn and Company, Boston, 1928.

The value in parentheses in (104) is calculated for each cell of the
table, and these cell values are summed over the table. Thus
for the cell 80-89 in col. (6),

$$\frac{(15)^2}{248(123)} = .0074.$$

Formula (103), however, does not provide a value of $\chi^2$ by
which to test the significance of the association found.

The coefficient of contingency has the defect that it understates
the amount of correlation actually present, in inverse proportion
to the number of cells in the table. For a 3 × 3 table
having perfect correlation, C would not be 1.00, as it should, but
.816; for a 5 × 5 table, the maximum value of C is .894; for a
7 × 7 table, .926; for a 10 × 10 table, .949. Evidently C is not
comparable between tables with different numbers of cells.
For these reasons, it is well to apply C only to tables having
say from 25 to 100 cells.

It is possible to correct C to some extent for the above fault
in cases where the correlation table has a fairly normal surface, as
shown by the row and column totals in ordered series. For this
purpose Table 60 may be used in connection with formula (105):

$$\text{corrected } C = \frac{C}{t_r t_c}. \qquad (105)$$

If for the moment we regard Table 59 as normal, we have from
Table 60 for three columns $t_c = .859$, and for six rows $t_r = .959$,
so that

$$\text{corrected } C = \frac{.21}{.959(.859)} = .25.$$
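
Both points in this passage are easily verified by machine. The
sketch below is not part of the original text. It assumes the
standard result that the largest value C can reach in a square table
of k rows and k columns is the square root of (k - 1)/k, which
reproduces the maxima quoted above, and it repeats the correction
with the Table 60 factors for three columns and six rows. The
function names are our own.

```python
from math import sqrt


def max_contingency(k):
    """Upper limit of C for a k x k table (assumed: sqrt((k - 1) / k))."""
    return sqrt((k - 1) / k)


def corrected_c(c, t_r, t_c):
    """Divide C by the row and column factors for broad grouping."""
    return c / (t_r * t_c)


maxima = {k: round(max_contingency(k), 3) for k in (3, 5, 7, 10)}
adjusted = corrected_c(0.21, t_r=0.959, t_c=0.859)

print(maxima)
print(round(adjusted, 2))
```

The upper limit rises toward 1.00 only slowly as the number of
classes grows, which is why values of C from tables of different
sizes should not be compared directly.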


Table 60 — Factors for Correcting C for Broad Grouping*

            Correction factor                Correction factor
Number      ($t_r$, $t_c$)        Number      ($t_r$, $t_c$)
   2            .798                 9            .981
   3            .859                10            .985
   4            .915                11            .987
   5            .943                12            .989
   6            .959                13            .991
   7            .970                14            .992
   8            .976                15            .993

* From C. C. Peters and W. R. Van Voorhis, Statistical Procedures and
Their Mathematical Bases, p. 398, McGraw-Hill Book Company, Inc., New
York, 1940.

The change in the value of C in this case is slight, and will
always be so where the original value of C is low. The correction
is therefore worth making only when the value of C is fairly
high. Moreover, in the present case, one of the series in Table
59 is unordered, so we are not justified in regarding it as
approximately normal in form, or in applying this correction to the C
obtained from it.

A coefficient of contingency, C, needs perhaps even more
careful interpretation than other coefficients of correlation. In
the first place, it has no sign, so that its meaning is dependent
upon an examination of the correlation table itself. When both
series are ordered, it is possible to assign a sign to C; otherwise,
not. In Table 59, the prisoner classification is unordered, so
the C we found can have no sign. Notice also that the sizes
of the $\chi^2$ for the three classes of criminals are not comparable,
because the number of prisoners is different in each class. We
may, however, compute the mean I.Q. for each of the three
classes, and in that way note how they compare in intelligence.
Thus we find that property offenders are most intelligent with
an I.Q. of 79.96, while murderers and sex offenders are
approximately equal with I.Q.'s of 75.71 and 74.51, respectively. If
the categories were Life Sentence, Medium Sentence, Short
Sentence, instead of Murderers, Sex Offenders, Property Offenders,
the sign of C might be regarded as negative, since intelligence
increases as the length of prison sentence decreases. If neither
factor in the table was quantitative, means could not be computed.
In that case, we could only compare the columns with
respect to the proportions of their frequencies falling in each
category of the stub.

6. Correlation in Fourfold Tables. — Any scale may be divided
into just two parts, or dichotomies. For example, we may measure
head lengths, and then classify heads below a certain length
as short, and those of this length and above as long. Many
sociological variables that have never been measured are commonly
treated as dichotomies, e.g., Cooperative, Not Cooperative.
Some information is gained if a more detailed breakdown
is feasible, such as Completely Cooperative, Very Cooperative,
Average Cooperative, Uncooperative, Completely Uncooperative.

Some qualities are most conveniently regarded as attributes
rather than as quantitative variables, and naturally take a
dichotomous form. Examples are Violator of Parole, Nonviolator
of Parole; White Race, Other Race.

The measurement of the amount of relationship in a 2 × 2
table is usually rather rough and inexact, regardless of what
method is used. On this account, such a table is often merely
tested for the presence of relationship, without attempting to
measure it. The Chi-square test, explained in Chap. IX, is
commonly relied on for this purpose.

Suppose we are interested in whether or not there was any
association between the occupation of agriculture and the
tendency to commit crime in a given state over the period 1920-
1930. Table 61 gives all the information at hand bearing on the
question, together with the scheme of symbols used in a short
formula for $\chi^2$ adapted to a 2 × 2 table.


Table 61 — Occupational Distribution of the Adult Male Prison
and Nonprison Populations of a Given State, 1920-1930

Occupational        Mean prison       Mean nonprison       Total
classification      population        population
Agriculture            690 (u)        1,100,000 (v)        1,100,690 ($m_1$)
Nonagriculture       2,310 (w)          900,000 (x)          902,310 ($m_2$)
Total                3,000 ($n_1$)    2,000,000 ($n_2$)    2,003,000 (N)

$$\chi^2 = \frac{(ux - vw)^2 N}{n_1 n_2 m_1 m_2}. \qquad (106)$$

Substituting in this formula,

$$\chi^2 = \frac{[(690)(900{,}000) - (1{,}100{,}000)(2{,}310)]^2 \, 2{,}003{,}000}{(3{,}000)(2{,}000{,}000)(1{,}100{,}690)(902{,}310)} = 1{,}239.$$


Entering Appendix Table 2 with one degree of freedom, as we
did in Chap. IX, we see that so large a value of $\chi^2$ would occur
by chance much less often than once in 100 times. We may,
therefore, regard the presence of association between the occupation
of agriculture and the commitment of crime in Table 61
as established beyond doubt.
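
The short formula for a fourfold table lends itself to a few lines of
Python. The sketch below is not part of the original text; the
function name is our own, and the cell letters u, v, w, x follow the
arrangement of Table 61, with the marginal totals formed inside the
function.

```python
def chi_square_fourfold(u, v, w, x):
    """Chi-square for a 2 x 2 table laid out as

           u  v
           w  x

    using (ux - vw)^2 N / (n1 n2 m1 m2)."""
    m1, m2 = u + v, w + x          # row totals
    n1, n2 = u + w, v + x          # column totals
    n = m1 + m2                    # grand total
    return (u * x - v * w) ** 2 * n / (n1 * n2 * m1 * m2)


chi2 = chi_square_fourfold(u=690, v=1_100_000, w=2_310, x=900_000)
print(round(chi2))
```

With one degree of freedom, a value of this size lies far beyond any
entry ordinarily given in a Chi-square table.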

If it seems worth while to go farther than the $\chi^2$ test, and try
to estimate approximately the degree of association in a 2 × 2
table, there are several coefficients available. They are based
on different principles, however, and give different results.
We shall illustrate three such coefficients, namely, Yule's Q,
the ordinary coefficient of correlation adapted to fourfold tables,
$r_4$, and the coefficient of tetrachoric correlation, $r_t$. Where one
of them will not meet the needs of a particular problem, another
usually will.

The formula for Yule's Q is

$$Q = \frac{ux - vw}{ux + vw}, \qquad (107)$$

where the symbols refer to cell frequencies as shown in Table
61. Let us apply it to the data of Table 61. Substituting, we
have

$$Q = \frac{(690)(900{,}000) - (1{,}100{,}000)(2{,}310)}{(690)(900{,}000) + (1{,}100{,}000)(2{,}310)},$$

$$Q = -.61.$$

According to this coefficient, there is a moderate amount of
negative association between the occupation of agriculture and
imprisonment for crime in Table 61, or, more generally, between
the first column and the first row factors, when the positive
and negative factors (e.g., Prison Population, Nonprison Population,
Agriculture, Nonagriculture) are arranged as in the table.
The result appears reasonable when it is noted that men usually
engaged in agriculture formed only 690/3,000 = 0.23 of the
prison population, but 1,100,000/2,000,000 = 0.55 of the
nonprison population.
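
A brief Python sketch of Yule's Q follows. It is not part of the
original text, and the function name is our own. It applies the
formula to the cell frequencies of Table 61, recomputes the two
proportions just mentioned, and tries two made-up tables with an
empty cell to show the limiting values of the coefficient.

```python
def yules_q(u, v, w, x):
    """Yule's Q for a 2 x 2 table with cells u, v (first row)
    and w, x (second row)."""
    return (u * x - v * w) / (u * x + v * w)


q = yules_q(u=690, v=1_100_000, w=2_310, x=900_000)

# Share of each population usually engaged in agriculture.
prison_share = 690 / 3_000
nonprison_share = 1_100_000 / 2_000_000

# Hypothetical tables: one empty cell is enough to drive Q to a limit.
q_upper = yules_q(5, 0, 3, 7)     # v = 0
q_lower = yules_q(0, 4, 3, 7)     # u = 0

print(round(q, 2), prison_share, nonprison_share, q_upper, q_lower)
```

Since only the cross products ux and vw enter the formula, Q is
unchanged if a whole row or column of the table is multiplied by a
constant.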

Notice that Q = 0 if ux = vw, or if u/w = v/x; that Q = +1
if v and/or w is 0; and that Q = -1 if u and/or x is 0. In other
words, in Table 61, Q would show (1) zero association if the cell
frequencies represented a purely random distribution of the
table totals; (2) perfect positive association if all of the prison
population, and/or none of the nonprison population was
engaged in agriculture; (3) perfect negative association if none of
the prison population, and/or all of the nonprison population
was engaged in agriculture. The requirement for perfect association
is less stringent than if "and/or" was replaced by "and"
above, but Q is appropriate for treating the data of Table 61, if we
are interested in the proportion of the prison population drawn
from agriculture, as compared with the proportion of the
nonprison population drawn from agriculture.

Should we want to measure the extent to which farmers and
prisoners are strictly identical or exclusive categories, we may
use the formula

$$r_4 = \frac{ux - vw}{\sqrt{m_1 m_2 n_1 n_2}} = \frac{\chi}{\sqrt{N}}, \qquad (108)$$

which assumes v = w = 0 (Table 61) for perfect positive association,
u = x = 0 for perfect negative association, and (like Q)
ux = vw for no association. For Table 61,

$$r_4 = \frac{(690)(900{,}000) - (1{,}100{,}000)(2{,}310)}{\sqrt{(1{,}100{,}690)(902{,}310)(3{,}000)(2{,}000{,}000)}},$$

$$r_4 = -.025.$$
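
The fourfold coefficient may be computed in the same manner. The
sketch below is not part of the original text, and the function name
is our own. Besides repeating the substitution for Table 61, it
checks that the square of the coefficient multiplied by N returns
the Chi-square of about 1,239 found earlier for the same table.

```python
from math import sqrt


def fourfold_r(u, v, w, x):
    """Product-moment r for a 2 x 2 table:
    (ux - vw) / sqrt(m1 m2 n1 n2)."""
    m1, m2 = u + v, w + x          # row totals
    n1, n2 = u + w, v + x          # column totals
    return (u * x - v * w) / sqrt(m1 * m2 * n1 * n2)


r4 = fourfold_r(u=690, v=1_100_000, w=2_310, x=900_000)
n = 2_003_000
chi2_from_r4 = r4 ** 2 * n         # should agree with the Chi-square test

print(round(r4, 3), round(chi2_from_r4))
```

The two coefficients thus rest on the same cross products but are
scaled differently, which accounts for the wide gap between -.61
and -.025.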

In view of the fact that the proportion of agriculturalists in
the prison population was under half that in the nonprison
population, the value of $r_4$ seems to be entirely too low, while
the value of Q is about what would be expected. It seems extreme
to insist that for perfect negative correlation the total
nonagricultural population, but not a single farmer, must be in prison,
as the formula for $r_4$ requires. For other problems, however,
$r_4$ may be more appropriate than Yule's Q. This suggests that
the choice of a measure of correlation should be adapted to the
particular problem and interest of the investigator.

The two coefficients, Yule's Q and $r_4$, are both designed for the
special case where the frequencies are impressionistically divided
into two groups, or, in geometric terms, roughly collected at two
discrete points. In Table 61, these points are Agriculture and
Nonagriculture for one factor, and Prison population and
Nonprison population for the other.

When the frequencies are distributed along two quantitative
scales, and on each scale they are divided into two groups by a
mark on the scale, and it is desired to find the amount of correlation
between the paired scale values rather than between the
proportions of cases in the two dichotomies, the so-called
tetrachoric method is appropriate if the underlying mathematical
assumptions mentioned below can be met. In Table 63, the
factor, size of household, is reduced to two classes from the
quantitative distribution in Table 62; the other factor, Relief
and Nonrelief, is qualitative, like the categories of Table 61.¹

There are difficulties in the computation of the tetrachoric
coefficient, $r_t$, but an approximate formula is

(109)

where, reading from a table of normal areas and ordinates
(Appendix Table 1),

h is the $x/\sigma$ value at $.5 - n_1/N$ or $.5 - n_2/N$,
k is the $x/\sigma$ value at $.5 - m_1/N$ or $.5 - m_2/N$,
H is the height of the ordinate at h,
K is the height of the ordinate at k,
and the other symbols have the same meanings as in Table 61.

The derivation of formula (109) assumes that both of the
series (e.g., size of households and relief-nonrelief) are normally
distributed, that both dichotomies are continuous, that the
total frequency of the table is large, that the dichotomous
divisions are not made too far toward the extremes of their
distributions, and that the relationship is linear. If the table is
not normal, the value of r_t is affected by the point of division
of the dichotomies, i.e., by whether each series is divided in the
middle of the scale or at some other point.


Table 62. Distribution of Rural Relief and Nonrelief Households
by Size, October, 1933*

                                    Households
Size of household          Relief    Nonrelief      Total

10 persons and over           290          246        536
9 persons                     202          213        415
8 persons                     353          336        689
7 persons                     493          560      1,053
6 persons                     633          997      1,630
5 persons                     834        1,322      2,156
4 persons                     846        2,061      2,907
3 persons                     846        2,408      3,254
2 persons                     745        2,430      3,175
1 person                      358          627        985

All households              5,600       11,200     16,800

* Adapted from Thomas C. McCormick, Comparative Study of Rural Relief and
Nonrelief Households, p. S8, Research Monograph II, Works Progress
Administration, Division of Social Research, Washington, D. C., 1935.
Mid-point of last interval taken as 11.


¹ An alternative formula is

$$r_t = -\cos\left[\frac{\pi\sqrt{vw}}{\sqrt{ux} + \sqrt{vw}}\right], \qquad (110)$$

where π = 180°, the symbols are arranged as in Table 63, and the sign
of r_t is interpreted as in the case of Yule's Q above.

In view of these restrictions, it hardly seems legitimate to
apply r_t to Table 61 above. As Fig. 49 shows, the dichotomous
line is drawn at the far upper end of the distribution of
criminality, where the value of r_t is very sensitive to any
skewness in the tail of the curve.


Fig. 49. Proportion of adult male population in prison, Table 61.


Table 63. Number of Rural Relief and Nonrelief Households
Containing Less than Four Persons, and Four Persons and
Over, October, 1933*

                                       Frequency
Size of household          Relief        Nonrelief         Total

4 persons and more       3,651 (u)       5,735 (v)       9,386 (m_1)
3 persons and less       1,949 (w)       5,465 (x)       7,414 (m_2)

All households           5,600 (n_1)    11,200 (n_2)    16,800 (N)

* The data in the table should be so arranged that the value of the
independent factor (size of household) increases from the bottom row to
the top row, and the value of the dependent factor (economic
independence) increases from the first column (relief) to the second
column (nonrelief).

An inspection of Table 62 suggests that the distribution by
size of household is somewhat skewed. We can test this, however,
by shifting the position of the dichotomous line, and noting
the effect on the value of r_t. If r_t remains rather stable, it is
evidence that the distribution is normal enough for the use of the
tetrachoric method. There is no way to judge the normality

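Substituting in formula (109),

$$r_t = \frac{1}{(.14)(.43)}\left(1 - \frac{1}{16{,}800}\sqrt{(16{,}800)^2 - \frac{(2)(.14)(.43)\,[(1{,}949)(5{,}735) - (5{,}465)(3{,}651)]}{(.395)(.364)}}\right)$$

$$= \frac{1}{.0602}\left(1 - \frac{17{,}017}{16{,}800}\right),$$

$$r_t = -.21.$$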
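
Read back into symbols, the substitution above implies the following general form for formula (109). This is a reconstruction from the worked figures, not a quotation; it assumes the cell labels u, v, w, x of Table 63, with h, k, H, K, and N as defined earlier.

```latex
r_t = \frac{1}{hk}\left(1 - \frac{1}{N}\sqrt{N^{2} - \frac{2hk\,(vw - ux)}{HK}}\right)
```

Squaring out shows that this is a root of r_t - (hk/2) r_t^2 = (vw - ux)/(N^2 HK), so that when hk is small the coefficient is close to (vw - ux)/(N^2 HK) itself.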


From Table 63, we see that 65 per cent of relief households
have four or more persons, compared with only 51 per cent of
nonrelief households. We therefore say that the degree of
economic independence of a household is to a slight extent
negatively correlated with the size of the household, as shown by
the value r_t = -.21.
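
The computation for Table 63 can be repeated by machine. The sketch below is only illustrative: it assumes the form of the approximate formula implied by the substitution above, takes the cosine formula of the footnote with a negative sign in front, and uses Python's `statistics.NormalDist` in place of Appendix Table 1. The function names are our own.

```python
from math import cos, pi, sqrt
from statistics import NormalDist

def tetrachoric_approx(u, v, w, x):
    """Approximate r_t for a fourfold table laid out as in Table 63:
    top row (u, v), bottom row (w, x)."""
    n = u + v + w + x
    std = NormalDist()
    # x/sigma deviates at the points of division of the rows and columns
    h = abs(std.inv_cdf((u + v) / n))
    k = abs(std.inv_cdf((u + w) / n))
    big_h, big_k = std.pdf(h), std.pdf(k)      # ordinates at h and k
    d = (v * w - u * x) / (n ** 2 * big_h * big_k)
    if h * k < 1e-12:                          # median split: no quadratic term
        return d
    return (1 - sqrt(1 - 2 * h * k * d)) / (h * k)

def tetrachoric_cos(u, v, w, x):
    """Cosine approximation (assumed form of the alternative formula)."""
    return -cos(pi * sqrt(v * w) / (sqrt(u * x) + sqrt(v * w)))

r_approx = tetrachoric_approx(3651, 5735, 1949, 5465)
r_cos = tetrachoric_cos(3651, 5735, 1949, 5465)
print(f"approximate formula: {r_approx:.3f}; cosine formula: {r_cos:.3f}")
```

With unrounded deviates the first value comes out at about -.215, which the text, working from the two-place table values h = .14 and k = .43, gives as -.21. The cosine formula gives about -.224.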

A quick method of finding the value of r_t is provided by L.
Chesire, M. Saffir, and L. L. Thurstone's Computing Diagrams
for the Tetrachoric Correlation Coefficient. We shall use one of
these diagrams (Fig. 50) to test the normality of the
size-of-household series in Table 62 by recomputing r_t after
shifting the dichotomous line of division from three- to
five-person households. The new groupings are shown in Table 64.
The frequencies are reduced to proportions of the table total,
16,800, by multiplying the reciprocal of 16,800, that is,
1/16,800, into each cell frequency. The proportions are entered
in Table 65. We now take any row or column total that is not
greater than .500 as a, any other column or row total at right
angles to it as b, and the proportion in the cell common to the
a row (or column) and the b column (or row), as c. One set of
these letters is indicated in the table. From Fig. 50, the
diagram for a = .33, we find at the intersection of the
orthogonal (i.e., at right angles) lines representing b = .74 and
c = .22 a value r_t = -.23. If in Table 65 c falls in a positive
quadrant, r_t has the sign shown in the diagram; but if c is in a
negative quadrant, the sign indicated in the diagram is reversed.
The signs of the quadrants are marked in Table 65, where it is
seen that c is in a positive quadrant. Therefore, r_t = -.23,
which agrees closely with the value of r_t computed for Table 63
with a different division of the dichotomy for size of
households. So far as this test goes, then, the table seems to be
normal enough to permit the use of the tetrachoric method. The
test should be made for other points of division on the
size-of-household scale, but would still be incomplete because
new subdivisions cannot be tested in the relief-nonrelief series
also.¹


Table 64. Number of Rural Relief and Nonrelief Households
Containing Less than Six Persons, and Six Persons and Over,
October, 1933

                                     Frequency
Size of household          Relief    Nonrelief      Total

6 persons and more          1,971        2,352      4,323
5 persons and less          3,629        8,848     12,477

All households              5,600       11,200     16,800


Table 65. Frequencies of Table 64 Reduced to Proportions of the
Table Total, for Use with Chesire, Saffir, and Thurstone's
Computing Diagrams

                                     Frequency
Size of household          Relief      Nonrelief      Total

6 persons and more         - .117        + .140        .257
5 persons and less         + .216 c      - .527        .743 = b

All households               .333 = a      .667       1.000


Fig. 50. (From L. Chesire, M. Saffir, and L. L. Thurstone, Computing
Diagrams for the Tetrachoric Correlation Coefficient, University of
Chicago Bookstore, Chicago, 1933.)
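
The reduction of Table 64 to proportions, and the choice of a, b, and c, can be sketched as follows. The layout of the list and the variable names are our own; the frequencies are those of Table 64.

```python
# Frequencies of Table 64: rows = 6 persons and more, 5 persons and less;
# columns = relief, nonrelief.
table_64 = [[1971, 2352],
            [3629, 8848]]

total = sum(sum(row) for row in table_64)
props = [[cell / total for cell in row] for row in table_64]
row_totals = [sum(row) for row in props]
col_totals = [sum(col) for col in zip(*props)]

a = col_totals[0]    # relief column total, not greater than .500
b = row_totals[1]    # a total at right angles to it: 5 persons and less
c = props[1][0]      # the cell common to that column and that row

print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")   # a = 0.333, b = 0.743, c = 0.216
```

These are the entries marked a, b, and c in Table 65; rounded to two places they give the values with which the diagram is entered.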

It should finally be observed that a fourfold correlation table
includes some of the basic elements of experimental design.
Thus, in Table 61, we have an independent factor or treatment,
Agriculture; a dependent factor, Imprisonment for Crime; an
experimental group, the Prison population; and a control group,
the Nonprison population. On the other hand, dichotomies are
used instead of classes based on measurement. In Table 61,
sex and (roughly) age have been held constant, and there is
nothing in the method that precludes as rigorous factor control
as seems worth while. Even the broad 2 x 2 table may, therefore,
be a valuable analytical device.


¹ If it is needed to determine the value of the tetrachoric coefficient, r_t,
very precisely, the complete formula may be seen in several texts, e.g.,
Davenport and Ekas, Statistical Methods in Biology, Medicine and
Psychology, 4th ed., pp. 105-106, or Peters and Van Voorhis, Statistical
Procedures and Their Mathematical Bases, p. 370; and helpful tables with
explanations are given by Karl Pearson, Tables for Statisticians and
Biometricians, 3d ed., Part I, pp xxxvi, xlui, 1, hn, 31, 32, 33, 34,
42-52, 52-57, Part II, pp. xhv, 73, 74. Formulas have been derived for the
standard errors (see Chap. XII) of biserial r, the coefficient of
tetrachoric correlation, and the coefficient of contingency. The standard
error of the coefficient of contingency, C, is hardly needed if the value
of χ² for the contingency table is referred to a table of χ², as was done
above in the section on this coefficient. The formula may be seen,
however, in such texts as Holzinger, Statistical Methods for Students in
Education, p. 278. The standard error of the tetrachoric correlation
coefficient, r_t, is also given by Davenport and Ekas, op. cit., p. 108,
and by Peters and Van Voorhis, op. cit., p. 371. G. U. Yule and M. G.
Kendall, An Introduction to the Theory of Statistics, p. 408, show the
formula for the standard error of  Somewhat simpler are the standard error
formulas of r_bis and Q:


Exercises

1. What are the chief disadvantages in correlating qualitative series,
as compared with quantitative series?

2. What preliminary test should be made of a qualitative table
before applying correlation to it?

3. What is the amount of correlation between type of college training
and success in teaching in the following table?


Two Hundred High School Teachers Classified by Type of College
from Which They Graduated, and by Success in Teaching

Institution              Successful   Unsuccessful    Total

Teachers college                 58             42      100
University or college            49             51      100

Total                           107             93      200


Defend your choice of a coefficient, and explain the meaning of your
results.

4. How much association, if any, is there between the sex of
distinguished people and the socioeconomic class of their fathers in
the table below?

Famous British Men and Women Classified by Social Origin*

Socioeconomic class of father        Men     Women

Nobleman                           1,059       108
Gentleman                            724        83
Politician, lawyer                   666        61
Soldier, sailor                      490        53
Divine                             1,100        57
Teacher                              274        23
Physician                            396        35
Administrator                        194        12
Writer, artist                       371       109
Businessman                          929        95
Artisan                              446        38
Laborer, servant                      81         8
Agriculture                          270        18

Total                              7,000       700

* Adapted from Table 4, p 70S, Joseph Schneider, Class Origin and Fame:
Eminent English Women, American Sociological Review, Vol. 5, pp. 700-713,
1940.

Is the association positive or negative? What does the association
mean in terms of this problem? What should be done about the
correction for broad grouping in this case? Is the value of C
significantly greater than zero? Explain what this means.

5. What is the amount of association between the sex of a sample of
undergraduate students at the University of Wisconsin in 1938-1939
and their state of residence? What coefficient is most appropriate to
this problem, and why? Interpret its meaning.


A Sample of Undergraduate Students, University of Wisconsin,
1938-1939, Classified by Sex and by State of Residence

State of residence       Male    Female    Total

Wisconsin                  94        44      138
Other                      17        27       44

Total                     111        71      182


6. Find the amount of association between type of offense and body
build in the table:


Criminals Classified by Type of Offense and Body Build*

          First-  Second-                   Burglary  Forgery                  Vs.      Arson
Body      degree  degree                    and       and              Other   public   and all
build     murder  murder  Assault  Robbery  larceny   fraud     Rape   sex     welfare  other    Total

Slender       42      79        7       54       213       57     18      18       31        7     526
Medium       155     358       49      244     1,004      260    119      80      127       71   2,467
Heavy         77     147       18       81       302      110     44      46       77       15     917

Total        274     584       74      379     1,519      427    181     144      235       93   3,910

* From E. A. Hooton, The American Criminal, Vol. I, Appendix, IX-8,
Harvard University Press, Cambridge, 1939.


Is the value of the coefficient significantly greater than zero? What
does the coefficient mean here?

7. What is the amount of correlation between the age distributions
of females in the neighboring urban and rural counties in the
accompanying table of age distributions?

Age Distributions of Females in a Rural (Rutherford) and a Near-by
Urban (Mecklenburg) County in North Carolina, 1930*

Age, years         Rural county    Urban county

Under 5                   2,553           6,542
5-9                       2,846           7,311
10-14                     2,428           6,424
15-19                     2,247           6,751
20-24                     2,109           7,862
25-29                     1,579           6,990
30-34                     1,201           5,277
35-44                     2,202           8,288
45-54                     1,520           5,199
55-64                       932           2,548
65-74                       515           1,342
75 and over                 237             586

Total                    20,369          65,120

* From Fifteenth Census of the United States, 1930, Bureau of the Census.


CHAPTER XII


SAMPLING AND SAMPLING ERRORS 

1. Definitions. — In sociological research, it is seldom possible 
to study more than a part of the whole, or "universe,"¹ in which
we are interested. For example, if it is wanted to know whether
the educated or the uneducated in the United States have the
higher birth rate, it would be impractical to find the birth rate
of the millions in each class. A sample would have to be taken
of each group, and the birth rates of the two samples compared.
If the samples were large and properly taken, the sample birth
rates should be rather close to the true rates for the total
educated and uneducated in the country.

A value (e.g., a mean) found from a sample is called a statistic,
whereas the corresponding true or expected value in the universe
is called a parameter. The primary purpose of all sampling is
to learn something about a universe, often to estimate the value
of a parameter from the value of a statistic. There is seldom
any interest in a sample or in the value of a statistic for its own
sake. A good sample is, therefore, one that yields reliable
information about a universe.
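
The distinction between statistic and parameter can be seen in a small trial by machine. The sketch below rests entirely on made-up figures: a hypothetical universe of 100,000 couples, of whom 12,000 had a birth during the year, from which a random sample of 2,000 is drawn.

```python
import random

random.seed(1939)   # fixed seed so that the run can be repeated

# Hypothetical universe: 1 = couple with a birth during the year, 0 = without.
universe = [1] * 12_000 + [0] * 88_000
parameter = sum(universe) / len(universe)      # true rate in the universe

sample = random.sample(universe, 2_000)        # random sample, no replacement
statistic = sum(sample) / len(sample)          # rate found from the sample

print(f"parameter = {parameter:.3f}, statistic = {statistic:.3f}")
```

The statistic will seldom equal the parameter exactly, but with a sample of this size, properly drawn, it should fall rather close to it.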

The first step in sampling is to define the universe to be
sampled. Thus we might define the universe of the educated
as consisting of all married couples in the United States living
together through the year 1939 who had successfully passed at
least the first year of high school; and the universe of the
uneducated as corresponding couples who had less schooling than
this, the birth rates to be compared as of the year 1939. The
sociological universe should usually be defined in both space
and time.

Since the universe is made up of events, a definition of the
event is also necessary. In our illustration above, the event is a
married couple with a birth or a married couple without a birth
during 1939. There are thus two kinds of events, couples with
a birth, which may be called successes, and couples without a
birth, which may be called failures. The word "success"
merely designates that particular event among two or more
different kinds of events in which the investigator is chiefly
interested. If we were sampling farmers to find their net annual
incomes in 1939, the event would be a farmer's income, and would
represent a continuous, measured variable. In the case of a
measured variable, there is, of course, no dichotomy of success
and failure, but merely a number of specific values.

¹ Synonymous terms often used are population and parent.

A universe may consist of an infinite or of a finite (limited) 
number of events. If the number of events is very large, the 
universe may be regarded as infinite for practical purposes. 

The events in a universe may already have happened, or may 
be yet to happen. In the former case they are said to be existent; 
in the latter, hypothetical. In our illustration above, at the 
beginning of the year 1939 none of the events (a birth or 
the absence of a birth to a married couple) has happened; at 
the end of the year, all of them have happened. Similarly, heads 
or tails is a hypothetical event before tossing a penny, an existent 
event after tossing. When the universe to be sampled consists
entirely of completed events, the universe is said to be
existent; when it consists entirely or partly of events yet to
come, it is said to be hypothetical. Prediction, with which social
science must be concerned, is of course possible only with
respect to hypothetical universes, since we do not "predict"
past events.

It is also important to notice whether the universe to be
sampled is to be regarded as a unique, historical set of events
(situation), as a constant or recurrent situation or system of
causes, or as a changing situation. If we are interested in the
death rate from the influenza epidemic of 1918, we have a unique
universe. But if we attempt to predict the rate of mortality
in Chicago, we assume a continuous or recurrent, i.e., essentially
unchanging, universe. As a matter of fact, strictly continuous
or recurrent universes never occur in social research, since there
is constant change in the complex of factors that compose any
social situation. The important question, therefore, is whether
the universe can be expected to be approximately recurrent, or
unchanged, over a period in which we are interested. If so, we
may be justified in trying to predict what will happen in that
period on the strength of what has occurred. It is sometimes
possible to discover the nature, direction, and rate of change in
a changing universe, so that we can allow for it in making a
prediction.

Finally, we shall find it worth while to distinguish between
homogeneous and heterogeneous hypothetical universes. A
universe is homogeneous when each hypothetical event has the
same a priori probability of becoming a success or a specified
value of a variable; it is heterogeneous when this probability
is not the same for each hypothetical event. A homogeneous
universe derives from a single set or system of causes, a
heterogeneous universe from two or more distinct sets of causes, as
judged by their effects on the hypothetical events in which we are
interested. When an insurance company sets up a class of "risks,"
composed of, say, males, native white, married, in the legal
profession, aged 25, class "A" medical examination, living in
Michigan, the company is trying to create a homogeneous
universe. Every person or hypothetical event admitted to the
class must be judged alike in respect to certain characteristics
that are believed to be related to the event, death. In other
words, each member of the class must have the same apparent
chance of death. In this way, and by requiring that the
conditions of life for the class must go on essentially in the future
as in the past (e.g., in case one of the insured persons enlists in a
war, his contract may be modified or canceled), some likelihood
is created that the system of causes affecting the mortality rate
of the class will continue each year about the same as the year
before, except for chance factors. If, however, a number of
men aged 65 were to be admitted to the risk class originally
composed of men aged 25, heterogeneity in the hypothetical
events would at once be introduced. While such a mixed or
heterogeneous universe might be recurrent if the proportion
of the two ages were kept constant, it could no longer claim to
be homogeneous, because the chance of death is known to be
different for a man aged 25 and a man aged 65. In practice,
just when a hypothetical universe may be considered
homogeneous is a matter of information and of degree. Of course, no
two persons in a life-table category actually have exactly the
same chance of death. The more completely the causes that
are related to the success are controlled and equated from event



ELEMENTARY SOCIAL STATISTICS 


to event, however, the more accurate and reliable the prediction
from the sample will tend to be, within limits. Where to stop
the effort to increase homogeneity is a question of judgment and
expediency. The more homogeneous the categories of any
classification are made, the greater their number, and the fewer
the events that will fall in any one category. While it is usually
advisable to sacrifice the size of the sample for the sake of
homogeneity to a certain point, diminishing returns set in if the
idea is carried too far.

2. Taking the Sample. — The events in a sample may be drawn 
from the universe (1) at random, (2) at regular intervals, (3)
at random from different strata or subclasses of the universe, or
(4) according to some purposive scheme, such as from the middle
and ends of a distribution. Thus, (1) we might draw marriage
certificates at random from an alphabetical list of all of the
marriage certificates in a file, (2) we might take every fifth
certificate in order, (3) we might draw a proportional number of
certificates at random from each separate county and city list, or
(4) we might take certificates from the top, middle, and bottom
of the list. The most common method of taking a sample, and
the one to which most of the statistical theory of sampling
applies, is the random. The method of sampling at random
proportionally from within strata (e.g., marriage certificates
taken at random from each county file) is more representative than
random sampling from the total universe (e.g., certificates
taken from a grand list). Unfortunately, however, the sampling
errors of only a few statistics are available in the case of
stratified sampling. Purposive sampling is seldom as reliable as
either of the other two methods, and difficulties of determining
sampling errors are encountered. We shall deal here primarily with
random sampling, but shall introduce stratified sampling for a
mean and for a proportion.
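
The four ways of drawing a sample can be put side by side in a short sketch. Everything in it is hypothetical: a file of 1,000 numbered certificates spread evenly over four counties, a sample of 100, and an interval of ten rather than five so that the sample sizes agree.

```python
import random

random.seed(7)

# Hypothetical file: 1,000 certificates, each tagged with one of four counties.
counties = ["A", "B", "C", "D"]
certificates = [(i, counties[i % 4]) for i in range(1, 1001)]
size = 100

# (1) Random: every certificate has an equal chance at each draw.
simple = random.sample(certificates, size)

# (2) Regular intervals: every tenth certificate after a random start.
step = len(certificates) // size
start = random.randrange(step)
systematic = certificates[start::step]

# (3) Stratified: a proportional number drawn at random within each county.
stratified = []
for county in counties:
    stratum = [c for c in certificates if c[1] == county]
    quota = round(size * len(stratum) / len(certificates))
    stratified.extend(random.sample(stratum, quota))

# (4) Purposive: top, middle, and bottom of the list.
third = size // 3
middle = len(certificates) // 2
low = middle - third // 2
purposive = (certificates[:third]
             + certificates[low:low + third]
             + certificates[-(size - 2 * third):])
```

Only the stratified draw guarantees each county its proportional share of the sample; the simple random draw gives it only on the average.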

A sample is random when at any given draw or trial, considered
alone, every existent event has an equal chance of being taken, or
every hypothetical event is equally likely to occur. In other
words, in a random sample the chance of being drawn or
"thrown" is independent of the character of the event. In
addition, a simple sample (also called a Bernoulli sample, after
a Swiss mathematician who studied it) requires that the
probability of drawing or throwing a success or a specified value
shall remain the same from one draw or trial to another.¹ It is
theoretically possible to sample any universe at random, but a
simple sample can be drawn only from an infinite universe.
Suppose we have an existent universe of 1,000 marriage
certificates and wish to take a random sample of 100. Suppose,
further, that 60 of these marriages have ended in divorce. At
the first draw, the probability of taking a marriage that has
ended in divorce is 60/1,000, or 0.060. If we happen to draw a
divorced marriage at the first trial, the probability of getting
a divorced marriage at the second draw will be 59/999, or 0.059;
otherwise it will be 60/999, or 0.06006. Now if the certificates
are drawn entirely independently of the question of divorce, at any
draw one certificate will have as good a chance of being taken
as any other in the file, and the resulting sample will be random.
But because the probability of drawing a given event, say, a
divorced marriage, changes from one draw to the next in the
limited universe of 1,000 certificates, the sample will not be
simple. If, however, after each draw of a certificate from
the file of 1,000, its number is recorded in the sample and the
certificate is returned to the file, the number of certificates in
the file will remain constant and the probability of drawing a
divorced marriage will not change from draw to draw. By the
act of replacement the universe becomes infinite. Of course, if
we happen to draw the same certificate more than once, it will
have to be accepted in the sample each time it is drawn, if it
is wanted to maintain an infinite universe.
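
The change in probability from draw to draw, and its disappearance under replacement, can be verified directly. The sketch below uses the figures of the illustration (1,000 certificates, 60 of them divorces); the variable names are our own.

```python
from fractions import Fraction

certificates, divorced = 1_000, 60

# First draw: 60 chances in 1,000.
p_first = Fraction(divorced, certificates)

# Second draw without replacement: the file has shrunk to 999 certificates.
p_second_after_divorce = Fraction(divorced - 1, certificates - 1)
p_second_after_other = Fraction(divorced, certificates - 1)

# Second draw with replacement: the file is unchanged, and so is the chance.
p_second_replaced = Fraction(divorced, certificates)

for label, p in [("first draw", p_first),
                 ("second draw, after a divorce", p_second_after_divorce),
                 ("second draw, after a non-divorce", p_second_after_other),
                 ("second draw, with replacement", p_second_replaced)]:
    print(f"{label}: {float(p):.5f}")
```

Without replacement the three probabilities differ in the third or fourth decimal place; with replacement the second draw has exactly the probability of the first, which is what the simple sample requires.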

In the case of a hypothetical universe, a simple sample of
hypothetical events can be drawn only if the universe is
homogeneous, like a life insurance risk group. All the causes that
determine the chance of death that the actuaries have been
able to consider must be the same for each individual in the
group. If a random sample were drawn from a mixture of
two different risk groups, so that, say, persons of different
sexes were included in the sample, the chance of death would
not be the same from one hypothetical event to another (person
to person), and the sample would be further removed from a
simple sample.

¹ But it is not assumed that the probability is the same for different kinds
of events or different values, e.g., a person of age 25 and a person of age 30
in a universe of ages.

It follows from the preceding definitions that a simple or
random sample of an existent universe is not necessarily a
simple or random sample of the hypothetical universe from
which the existent universe was derived. But when the existent
universe itself is a simple or random sample of a hypothetical
universe, a simple random sample of the existent universe will
be a simple or random sample of that hypothetical universe also.

In practice, it is not easy to obtain a random sample. The
most manageable case is that of a limited existent universe each
of whose events can be individually identified, such as the list
of marriage certificates mentioned above. If we take certificates
or pages of certificates at regular intervals from an alphabetical
or other random list, say every twentieth certificate or page, the
first page being chosen at random, the sample should apparently
be random, because there is no obvious connection between this
order and the information on the marriage certificates. If
the interval is not too large, this method should also be more
representative than other types of random sampling, since
it takes certificates proportionately from every part of the
list. There are many other devices for taking a random sample
from a list. One of the commonest is to number the items,
place corresponding numbers on tickets in a box, shuffle them,
and draw. Experience has shown, however, that methods
like the above will not always yield a random sample.
Mathematically, the ideal plan is to draw the sample from a table of
random numbers, such as L. H. C. Tippett's Random Sampling
Numbers,¹ which are combinations of digits taken at random
from census reports. A specimen page of these numbers is
shown below (Fig. 51).

¹ Tracts for Computers, Number XV, Cambridge University Press, London,
1927.


2952 6641 3992 9792   7979 5911 3170 5624
4167 9524 1545 1396   7203 5356 1300 2693
2730 7483 3408 2762   3563 1089 6913 7691
0560 5246 1112 6107   6008 8126 4233 8776
2754 9143 1405 9025   7002 6111 8816 6446

5870 2859 4988 1658   2922 6166 6069 2763
9263 2466 3398 5440   8738 6028 5048 2683
2002 7840 1690 7505   0423 8430 8759 7108
9568 2835 9427 3668   2596 8820 1955 6515
8243 1579 1930 5026   3426 7088 3991 7151

5667 3513 9270 6298   6396 7306 7898 7842
1018 6891 1212 6563   2201 5013 0730 2405
6841 5111 5688 3777   7354 3434 8336 6424
2041 2207 4889 7346   2865 1550 5960 5479
5565 4764 2617 5281   1870 6497 5744 9576

4508 1808 3289 3993   9485 4240 2835 9955
2152 6473 5692 9309   7661 1668 5431 7658
6917 4113 7340 6853   1172 7229 1279 5085
8241 4124 4131 9500   5657 3932 5942 3317
7913 3709 5944 9763   2755 4211 4995 8657

9385 7125 3230 0737   2957 1013 6369 4494
3436 6293 6025 9384   3343 1071 1468 4801
9094 1634 5070 0664   6510 0918 4601 4294
9226 9296 2796 7097   4057 2074 6297 2587
7781 3760 2895 7653   0091 7012 1308 1096

9742 9694 7347 0017   9572 1850 0116 1899
9420 9210 8787 9375   4663 0396 6717 5562
1179 3571 5992 3059   9015 5608 2348 8144
0708 4011 4057 1550   1674 1376 5243 4427
6350 3996 3795 2176   8182 4514 6349 3483

1414 7152 3658 1636   0638 3443 4440 3086
7041 8985 7011 5676   7570 6685 1776 3154
3243 2783 0840 9054   8862 5173 8433 9117
7922 4931 5753 6160   6566 8602 3423 9074
8769 3513 8976 0780   6382 0029 2619 5982

2510 7274 8743 0000   1850 2408 3602 5179
0224 2404 9811 6641   9732 1662 9158 1404
3009 8516 7245 9409   2844 0717 1072 3137
7489 0221 7921 2351   2696 4906 2484 3868
5188 1825 2220 9382   0532 1915 1790 2081

1198 2545 2482 9607   0067 3744 9866 5096
3908 4676 7816 6517   9121 3171 4119 3615
1094 2223 1675 2282   3712 8191 1330 1454
1817 7723 5582 7153   9518 0231 7782 5742
6208 9598 9623 2114   7747 2096 5027 0561

1752 4519 2749 8020   4642 1190 7302 8350
0486 6993 3115 5025   4887 1571 9819 6804
1942 3004 1442 2810   1479 0970 7302 3775
4930 9785 7460 3996   2864 0559 3985 8092
2349 1594 7152 0257   4041 4105 3180 9806

Fig. 51. Specimen page of random sample numbers. (From Tracts for
Computers, Number XV, Random Sampling Numbers, ed. by Karl Pearson,
arranged by L. H. C. Tippett, p. 1.)


Imagine that we wish to take a random sample of 200 marriage
certificates from a list of certificates in a state file. The
certificates are filed and numbered consecutively, so that any nth
certificate from the beginning of the list can be quickly located.
The smallest number printed in the table is 0000 and the largest
is 9999. If the number of events in the universe is close to
10,000, we can simply go down each column of four figures in
the table, taking for our sample the first 200 numbers within
the range of our universe that we meet. When the universe is
much smaller than 10,000, it sometimes saves time to assign
several table numbers to each event. For example, if there are
only 2,000 events (e.g., marriage certificates) in the universe, we
may assign to event number 1 the table numbers 0 through 4,
to event number 2 the table numbers 5 through 9, and so on.
If then we read, say, the number 0061 from the table, we draw
event number 13 from the universe (60/5 + 1 = 13).¹ The same
event is accepted only once from a limited universe, regardless
of how many times it may be drawn. Also, the number of digits
to read in the table may correspond to the number of digits
needed to express the total of events in the universe. Thus, if
the universe contains 800 events, we may draw three-digit
numbers, e.g., 295, 416, 273, and so on; if the number of events is
600,000, we may draw six-digit numbers, such as 295,266; 416,795;
273,074. The table may be read in any direction or order.
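
The assignment of several table numbers to each event can be written out as a short routine. The sketch below is illustrative only: it takes the first eight entries of the first column of Fig. 51, a universe of 2,000 events with five table numbers apiece, and function names of our own choosing.

```python
# First entries of the first column of Fig. 51, read downward.
column = ["2952", "4167", "2730", "0560", "2754", "5870", "9263", "2002"]

def event_for(table_number, numbers_per_event=5):
    """Serial number of the event to which a table number is assigned."""
    return table_number // numbers_per_event + 1

def draw_events(table_numbers, universe_size, sample_size, numbers_per_event=5):
    """Take events in table order, accepting each event only once."""
    chosen = []
    for entry in table_numbers:
        event = event_for(int(entry), numbers_per_event)
        if event <= universe_size and event not in chosen:
            chosen.append(event)
        if len(chosen) == sample_size:
            break
    return chosen

print(event_for(61))                              # 13, as in the text
print(draw_events(column, 2_000, 5))              # [591, 834, 547, 113, 551]
print([int(entry[:3]) for entry in column[:3]])   # [295, 416, 273]
```

The integer division does the work of reducing the table number to the multiple of five below it before dividing, and the test `event not in chosen` enforces the rule that an event is accepted only once from a limited universe.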

When the individual events of a limited universe cannot be
identified and labeled, probably the next best thing is to identify
groups of them, usually on a geographical-time basis. Thus a
random sample of the farmers of a state, of whom there is no list,
may be obtained by numbering each township in the state and
drawing a random sample of townships by one of the methods
suggested above. Then each township drawn in the sample
may be visited at a given date, and all the farmers in it taken in
the sample. Or, if necessary, the sample townships may be
divided into school districts and a random sample of these
districts drawn before going to the events (farmers) themselves.
When this approach has to be used, the number of groups
composing the universe should be as large as practicable, while the
number of events in each group should be a minimum and
nearly equal from group to group. For example, if the township
is the smallest unit for which data are available, a township with
a large population may be subdivided and represented by two or
three tickets, instead of by one, in the drawing, so that the
probability of drawing a township will be roughly proportional
to the size of its population.
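
The device of giving a large township two or three tickets can be sketched as follows. The township names, their numbers of farmers, and the rule of roughly one ticket per hundred farmers are all hypothetical, chosen only to show the mechanism.

```python
import random

random.seed(11)

# Hypothetical townships and their numbers of farmers.
farmers = {"Ash": 90, "Birch": 110, "Cedar": 310, "Dale": 95, "Elm": 205}
unit = 100   # roughly one ticket per 100 farmers (an illustrative choice)

# A large township is represented by two or three tickets instead of one.
tickets = []
for township, count in farmers.items():
    tickets.extend([township] * max(1, round(count / unit)))

# Shuffle the box and draw until two different townships are in hand.
random.shuffle(tickets)
drawn = []
for ticket in tickets:
    if ticket not in drawn:
        drawn.append(ticket)
    if len(drawn) == 2:
        break

print(sorted(tickets), drawn)
```

With three of the eight tickets, Cedar is about three times as likely to come up at any draw as a township holding a single ticket, which is roughly the ratio of their populations.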

¹ In this case, to find the serial number, if the table number is not already
an exact multiple of five, reduce it to the next lower multiple of five,
divide by five, and add one.

In dealing with an infinite or a very large universe, it is of
course not possible to list and label all the individual events, but
it may be feasible to use the group method mentioned in the
preceding paragraph. For example, if a physical anthropologist
wanted to sample the white race, he might divide the countries
occupied by the various branches of this race into small
geographical areas, number them, and draw them at random. He
would then probably have to go to each of the areas drawn,
further subdivide them, draw a random sample of the small
subdivisions, and then finally perhaps take a random sample of
the individuals living in each subdivision.

Such a plan as the above, however, is not adapted to a hypothetical
universe, like the number of heads or tails that might be
thrown with a penny, or the number of divorces that might occur
in the United States over some future period of time. The only
way to draw a random sample in this case is to define a set of
conditions, or causal system (e.g., social conditions in the United
States, Jan. 1, 1940 to Jan. 1, 1941), draw at random a number
of hypothetical events that satisfy the conditions (couples 
married on Jan. 1, 1940), and let the system act to convert them 
to existent events (couples divorced, not divorced on Jan. 1, 
1941) ; or else wait until the system has produced a large number 
of existent events (couples married Jan. 1, 1940, after Jan. 1, 
1941), and then draw at random as many of them as are needed 
for the sample. In either case, if a simple sample is wanted, it 
is, of course, necessary to make sure that the existent events 
(couples divorced or not divorced Jan. 1, 1941) were derived 
from hypothetical events (couples married Jan. 1, 1940), each of 
which had (on Jan. 1, 1940) essentially the same a priori probability
of becoming a success (divorced couple) throughout
the experiment (Jan. 1, 1940 to Jan. 1, 1941), except for chance 
factors. Notice also that a causal system that does not act 
uniformly over its time cycle must furnish sample events from 
the whole of its cycle, to avoid important omissions. For exam- 
ple, in determining the death rate of infants during the first year 
of life, observations should extend over the complete period of 
12 months, because the death rate is subject to seasonal variation. 

If a heterogeneous hypothetical universe, i.e., a hypothetical
universe in which the chance of success is not the same from one
hypothetical event to another (e.g., a class of life insurance risks
of different ages, where p is the probability of an individual's
death within the year, and a hypothetical event is a person taken
at the beginning of the year), exists without important change
over a period of time, then a random sample drawn from a large
number of the events at the end of this period will yield an estimate
of the mean probability of death for the mixed class.

When a universe is divided into strata with respect to some
trait, a proportional¹ simple subsample is taken from each stratum,
and these subsamples are combined, the resulting total
sample is called a Poisson sample, in honor of the French mathematician
who described it. Thus, for the purpose of drawing
a Poisson sample, an existent universe of family incomes in New
York City in 1939 may be divided into the classes Under $500,
$500-$999, $1,000-$1,499, . . . ; or, supposing that we are
interested in divorce, we may define a hypothetical universe of
ever-married women in the city of Philadelphia on Jan. 1, 1940,
consisting of subgroups whose members are alike in respect to
occupation of husband, presence or absence of children, religious
affiliation, length of time since marriage, and so on. If all the
requirements of Poisson sampling are to be met, each stratum
must constitute an infinite subuniverse. In the case of our
existent universe of family incomes, this will be approximately
true if the number of incomes in each class is very large. Any
hypothetical universe or stratum may be regarded as infinite on 
the assumption that the defined set of conditions theoretically 
acts to produce events without limit. For example, it may be 
reasoned that the conditions that produced a certain percentage 
of divorces among a group of Philadelphia women whose hus- 
bands were skilled laborers, who had borne children, who were 
Protestants, who had been married five years, and so on, might 
continue indefinitely to produce the same percentage of divorces 
(except for random errors) among women of this description. 
As a rule, however, it is more realistic to consider how long we 
may expect a hypothetical universe actually to persist without 
important change, and then decide whether the probable num- 
ber of events that will be produced in a given stratum within 
that period may be regarded as infinite for practical purposes. 
To refer again to our illustration, we might conclude that the 
set of conditions responsible for the divorce rate observed in the 
class of women defined above would probably remain essentially
the same no longer than perhaps a decade, but that in 10 years
several thousand women would come within the class, a number
great enough to be regarded as infinite without noticeable error.

¹ Preferably also weighted by the value of the standard deviation of the
stratum or subgroup.

If a simple sample is drawn from only one of several strata 
forming a universe, and from it an attempt is made to judge the 
whole universe, the sample is called a Lexis sample, after a German
statistician. The sampling error of a Poisson sample is
less than that of a random sample, while that of a Lexis sample is
greater. A Lexis sample is seldom taken intentionally, but may
occur when some important part of the universe is omitted from
the sample.¹ The Poisson sample, on the other hand, is the
most representative sample that can be taken of sociological 
data, and should be used much more than it now is. 

What has been said above about the sampling of an attribute
(an unmeasured quality called an event, such as the survival or
death of an insured person) applies equally to the sampling of a
variable (a measured quality, such as the net annual income of a
farm family). In the case of sampling a variable, the parameter
in which we are usually interested is the mean of the values in the
universe (e.g., the mean net annual income of the farmers in
Nebraska), although it may be the standard deviation, a correlation
coefficient, or other index.

When the purpose in taking a sample is to use the proportion, 
mean, or other statistic from the sample as an estimate of the 
corresponding parameter in the universe, it is necessary to know the
range of error in the estimate due to sampling. This can be 
found only if the sample is approximately of some standard type, 
such as random, simple, or Poisson. Thus, if we find from a 
simple sample of juvenile delinquents that 21 per cent were 
from broken homes, we are able to estimate with the aid of the 
mathematical theory of sampling that the chances are, say, 19 
to 1 that certain limits, say 15 to 27, will enclose the true per- 
centage from broken homes in the universe from which the sample 
was taken. If the nature of the sample is uncertain, so that we do
not know that it is, say, simple or Poisson, we cannot apply the
appropriate formulas for finding the errors of sampling, and so
cannot gauge the amount of error in any statistic estimated from
the sample. The chief assurance that we can have about the
nature of a sample must come from a knowledge of the method
by which it was taken. Thus, we must know that the conditions
of, say, simple sampling were at least broadly met in drawing the
sample before we can safely treat it as a simple sample.

¹ See Bruce D. Mudgett and S. R. Gevorkiantz, Reliability of Forest
Surveys, Journal of the American Statistical Association, Vol. 29, pp. 257-281,
1934.

3. The General Theory of Sampling. — In general, the theory 
of sampling that provides a basis for the measurement of sampling 
error is as follows. Suppose that we draw a large number, N, 
of random samples of equal size, n, from a universe of juvenile
delinquents, and list the number of delinquents with broken
homes (successes) in each sample. We shall then have a table
like Table 66.

This is a sampling distribution of the number or frequency of
successes per sample. We may find its standard deviation by
the familiar formula

$$\sigma_X = \sqrt{\frac{\sum f (X - M_X)^2}{N}}$$

where X is the number of successes per sample, M_X is the mean
number of successes per sample, f is the number of samples having
a given number of successes, and N is the total number of
samples. Since this is the standard deviation of the number of 
successes from many actual samples, we may call it an empirical 
standard error, to distinguish it from the standard deviation of a 
series where the question of sampling does not enter. We may 
further call the standard error of this formula the empirical 
standard error of the number of successes per sample, to differentiate 
it from the standard error of, say, a mean or correlation 
coefficient. 

An empirical standard error like the above has the disad- 
vantage, however, that it is itself a sample value that is affected 
by the number of samples taken, and varies because of random 
errors of sampling. Mathematicians are able to calculate a more
exact or theoretical standard error,¹ provided they are allowed to
specify the nature of the distribution of the universe values and
the conditions under which the sample is taken. This enables
them to lay down requirements which ensure that the parameter,
say, a frequency, will be distributed in the samples according to
some established mathematical principle, such as the binomial
theorem or the normal curve.

¹ Or probable error, if preferred. From Chap. IX, pp. 160 and 161, it will
be recalled that the probable error is related to the standard error by the
equation P.E. = .6745σ, where σ = ε in our subsequent notation.

Table 66. Distribution of Broken Homes per Sample of 50 Juvenile
Delinquents in 100 Random Samples

Broken homes per sample     Samples
 0                            0
 1                            0
 2                            0
 . . .                        . . .
 8                            1
 9                            1
10                            2
11                            5
12                            9
13                           10
14                           12
15                           12
16                           11
17                            8
18                            9
19                            5
20                            6
21                            4
22                            3
23                            1
24                            0
25                            1
26                            0
27                            0
 . . .                        . . .
48                            0
49                            0
50                            0
Total                       100
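As an illustration, the empirical standard error of Table 66 can be computed directly from the tabulated frequencies. The sketch below assumes that the rows elided from the table (shown by dots) all have zero samples, which agrees with the printed total of 100; the variable and function names are ours.

```python
from math import sqrt

# Table 66: successes per sample -> number of samples.
# Rows with zero samples are omitted; elided rows are assumed to be zero.
table_66 = {8: 1, 9: 1, 10: 2, 11: 5, 12: 9, 13: 10, 14: 12, 15: 12,
            16: 11, 17: 8, 18: 9, 19: 5, 20: 6, 21: 4, 22: 3, 23: 1, 25: 1}


def empirical_standard_error(freq):
    """Return the mean and standard deviation of a frequency table."""
    n_samples = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n_samples
    variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n_samples
    return mean, sqrt(variance)


mean_successes, empirical_se = empirical_standard_error(table_66)
print(round(mean_successes, 2), round(empirical_se, 2))
```

On these figures the mean number of broken homes per sample is 15.65 and the empirical standard error is about 3.34.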

Some of the commonest of the standard error formulas that
are applied in the sampling of attributes assume that the sampling
distribution is binomial in type. It will be recalled¹ that N times
the binomial expansion shows how many of N random samples of
n events each may be expected to have given numbers of successes
from 0 to n, where the probability of success remains the
same from event to event. If it can be shown that these requirements
were at least approximately complied with in drawing the
events of a sample, we may assume that the sampling distribution
will be approximately binomial in form. The standard deviation
of the binomial is well known, and is then the theoretical standard
error that we are seeking:

$$\epsilon_f = \sqrt{npq} = \sqrt{fq}$$

where p is the constant chance of success in the binomial universe,
q = 1 − p, n is the number of events in a single sample, and
f = np is the frequency. We have only to substitute in this formula to
get the standard deviation (called standard error) of the sampling
distribution.

¹ See Chap. IX.
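The agreement between the empirical and the theoretical standard error can be checked by a small simulation. The sketch below is not from the text: the values n = 50 and p = 0.3, the number of samples, and the seed are arbitrary assumptions, and Python's pseudo-random generator stands in for draws from an infinite, well-mixed universe.

```python
import random
from math import sqrt


def simulate_successes(n, p, n_samples, seed=0):
    """Number of successes in each of n_samples samples of n events."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n))
            for _ in range(n_samples)]


def standard_deviation(values):
    mean = sum(values) / len(values)
    return sqrt(sum((v - mean) ** 2 for v in values) / len(values))


n, p = 50, 0.3                      # hypothetical sample size and chance
counts = simulate_successes(n, p, n_samples=2000)
empirical = standard_deviation(counts)   # varies with the samples drawn
theoretical = sqrt(n * p * (1 - p))      # sqrt(npq), fixed by n and p

print(round(empirical, 2), round(theoretical, 2))
```

The empirical figure changes with the seed and with the number of samples, whereas the theoretical figure depends only on n and p, which is the point made in the text.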

In the investigation of sociological attributes, however, there 
is usually available only one sample, rather than a distribution 
of many samples. In that case, if the sample was taken under
binomial conditions and its size is large, the best estimate of p,
the proportion of successes in the universe, is the proportion of
successes in the single sample. This estimate of p is then used
in the above formula to compute an approximate theoretical
sampling error.

It will be noticed that the binomial theorem merely repeats 
the requirements of the simple sampling of attributes, which we 
have seen can be met only if an existent universe is infinite and
well mixed, or a hypothetical universe is homogeneous. Because
of the difficulties in taking a simple sample under many conditions
in sociological research, it is fortunate that the standard errors
of simple samples are usually not very different from those of
random samples, and in any case are somewhat larger. For
these reasons, investigators often apply simple sampling errors
to random or even stratified samples, in order to save labor or
to be on the conservative side when in doubt as to what the
error formula should be. 

4. Only Large Samples Considered. — In sociological inves- 
tigations, many factors that the sociologist is unable to control 
usually cause small samples to differ radically from one another. 
Small samples are, therefore, not often used in social research.
For this practical reason and for the sake of simplicity, the discus- 
sion in this book is limited to large samples. As a rule, the 
standard error formulas given may become rather seriously 
inaccurate if applied to samples with fewer than 20 to 25 items,
and are safest when used with much larger samples.¹

5. Standard Error Formulas. a. The Standard Error of a
Frequency. — As just shown, for a simple sample, the formula
for the standard error of a frequency is

$$\epsilon_f = \sqrt{npq} = \sqrt{fq} \tag{113}$$

where p is the constant probability of drawing a success at any
single draw, q = 1 − p, and f is the frequency in question.
p refers to the probability in the universe, i.e., to the true or
expected probability, and f to the true frequency, but they are
usually estimated from the sample when the latter is large.
We shall illustrate the use of this formula by application 
to the age distribution of an approximately simple sample of 
unemployed in New York City in 1930, shown in Table 67. 


Table 67. The Distribution of Unemployed Persons by Age, in a
Simple Sample of 100 Unemployed in New York City, 1930

Age, years      Unemployed (f)      d       fd      fd²
15-24                 28           -2      -56      112
25-34                 23           -1      -23       23
35-44                 21            0        0        0
45-54                 16            1       16       16
55-64                  9            2       18       36
65 and over            3            3        9       27
Total                100                   -36      214



What is the range of error in the sample estimate of the relative
number of unemployed in the age class 15-24? If we assume
that the universe of the unemployed in New York is existent
and large enough to be regarded as infinite for our purposes, and
that it does not change appreciably during the process of sampling,
then the probability of drawing an unemployed person
in the age class 15-24 should be constant from draw to draw,
and from one sample of 100 persons to another. Thus the
requirements of simple sampling are met, and we may determine
the error of sampling of the frequency by formula (113). Since
n is as large as 100, we accept 28 as an estimate of the true
frequency in the age class 15-24. Substituting in formula (113),

$$\epsilon_{28} = \sqrt{28\left(1 - \tfrac{28}{100}\right)},$$

$$\epsilon_{28} = 4.5.$$

¹ See Sec. 6.
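The substitution in formula (113) is easily verified. The following lines are a sketch only; the function name is ours, and p is estimated from the sample as f/n, as the text allows for large samples.

```python
from math import sqrt


def se_frequency(f, n):
    """Standard error of a frequency in a simple sample, sqrt(npq)."""
    p = f / n          # proportion of successes, estimated from the sample
    q = 1 - p
    return sqrt(n * p * q)


se_28 = se_frequency(28, 100)
print(round(se_28, 1))  # 4.5
```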

In a large number of such samples the frequency in the age
class 15-24 is approximately normally distributed, and the
standard error just found is an estimate of the standard deviation
of that normal distribution. Under these conditions, about two
times in three the true frequency in the age class 15-24 will be
included within one standard error above and below the sample
frequency of 28. That is, about two times out of three, we
should expect the true number of unemployed persons in the age
class 15-24 to be contained between the limits 28 ± 4.5, or
between 23.5 and 32.5. If we want more security than this, we
may multiply the standard error by two, getting limits of
28 ± 2(4.5), or 19 to 37, within which about 19 times in 20 the
true frequency will be found.¹ To attain practical certainty, we
may multiply the error by three, giving chances of about 369 to 1
that the true frequency is enclosed between 28 ± 3(4.5), or
between 14.5 and 41.5. Usually, a range of twice the standard
error is regarded as safe enough. In case this range, here
19 to 37, seems too wide to be of much value, and it is wanted
to narrow it, the size of the sample must necessarily be increased,
since the relative size of the sampling error varies inversely as √n (see
Sec. 6).

[Figure: the limits within which the true frequency in
the age class 15-24 years will be enclosed about 19 times out of 20 (in the long
run), as determined from a simple sample of 100 unemployed in New York
City, 1930.]

¹ See Appendix Table 1.
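The chances attached to one, two, and three standard errors come from the normal curve and can be reproduced with the error function in Python's standard library. This is a sketch under the text's assumption that the sampling distribution is approximately normal; the function names are ours.

```python
from math import erf, sqrt


def coverage(k):
    """Chance that a normal value lies within k standard errors of the mean."""
    return erf(k / sqrt(2))


def limits(estimate, se, k):
    """Lower and upper limits at k standard errors."""
    return estimate - k * se, estimate + k * se


for k in (1, 2, 3):
    chance = coverage(k)
    odds = chance / (1 - chance)   # chances in favor, to 1
    print(k, limits(28, 4.5, k), round(chance, 4), round(odds, 1))
```

For k = 3 the odds come to about 369 to 1, as stated. For k = 2 the exact normal figure is a little better than the round 19 in 20 used in the text, which corresponds more nearly to 1.96 standard errors.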

Evidently, if the size of the sample is decreased, the relative 
range of the sampling error increases, so that one reason why a 
small sample is not suitable for estimating the value of a param- 
eter is easily seen. For example, suppose that the number of 
persons in the sample of Table 67 is only 10, and the frequency 
in the age class 15-24 is 3. Substituting in formula (113), with
a factor n/(n − 1) = 10/9 inserted as a correction for the small
size of the sample,

$$\epsilon_3 = \sqrt{\tfrac{10}{9}\left[3\left(1 - \tfrac{3}{10}\right)\right]},$$

$$\epsilon_3 = 1.53.$$

We no longer have confidence in the sample frequency as an
estimate of the universe frequency for use in the formula, but
disregarding this, we find the range of twice the standard error
to be 3 ± 2(1.53) = −0.06 to 6.06, or approximately 0 to 6.
The ratio of twice the error to the frequency is now 3.06/3 = 1.02,
as compared with 9/28 = 0.32 for the larger sample.
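The comparison of the small and the large sample can be checked as follows. The sketch uses the n/(n − 1) factor exactly as it is inserted in the text; the function name is ours.

```python
from math import sqrt


def se_frequency_small(f, n):
    """Formula (113) with the n/(n - 1) small-sample factor."""
    return sqrt(n / (n - 1) * f * (1 - f / n))


se_3 = se_frequency_small(3, 10)     # about 1.53
relative_small = 2 * se_3 / 3        # twice the error over the frequency
relative_large = 2 * 4.5 / 28        # same ratio for the sample of 100

print(round(se_3, 2), round(relative_small, 2), round(relative_large, 2))
```

The ratio falls from about 1.02 to about 0.32 as the sample grows from 10 to 100, roughly the factor of √10 that the inverse square-root rule would suggest.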

If it is known that a sample was taken under Poisson conditions
from a stratified universe, the standard error of a frequency
estimated from it may be obtained by the formula

$$\epsilon_f^2 = npq - n\sigma_p^2 \tag{114}$$

where p_j is the proportion of successes in any stratum, j, of the
universe; p is the mean of the p_j's; σ_p² is the variance of the
p_j's, n_j being the number of events in stratum j:

$$\sigma_p^2 = \frac{\sum_{j=1}^{k} n_j p_j^2}{n} - p^2 \tag{115}$$

and k is the number of the strata. As in the case of formula
(113), if the universe values of these statistics are not known, they
are commonly estimated from the sample, provided the frequency
in each stratum of the sample is fairly large (say, 50 or
more).

Table 68. One Hundred Unemployed Persons in New York City,
1930, Classified by Color and Nativity

                          Number of unemployed
Age, years      Total   Native white   Foreign-born white   Negro
                            (1)               (2)            (3)
15-24             28          5                18              5
25-34             23          8                13              3
35-44             21          9                 7              5
45-54             16          9                 1              3
55-64              9          8                 0              3
65 and over        3          3                 0              0
Total            100         42                39             19


Assume that Table 68 above is a Poisson sample, drawn as
previously described, the strata being native white (j = 1),
foreign-born white (j = 2), and Negro (j = 3), as shown in the
table. Let n_1 = 42, n_2 = 39, and n_3 = 19; and let the numbers
in each stratum falling in the age class 15-24 be f_1 = 5, f_2 = 18,
and f_3 = 5, giving p_1 = 5/42 = 0.12, p_2 = 18/39 = 0.46, and
p_3 = 5/19 = 0.26.

Then p = 28/100 = 0.28, q = 1 − p = 1 − 0.28 = 0.72, and from
formula (115)

$$\sigma_p^2 = \frac{42(.12)^2 + 39(.46)^2 + 19(.26)^2}{100} - (.28)^2 = 0.023.$$

Substituting in formula (114),

$$\epsilon_f^2 = 100(.28)(.72) - 100(.023),$$
$$\epsilon_f^2 = 17.86,$$
$$\epsilon_f = 4.23.$$

Notice that this error is slightly smaller than that found on the 
assumption that Table 67 represented a simple sample. 
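The Poisson calculation can be sketched in code from the stratum counts of Table 68. The function name is ours, and the stratum proportions are carried at full precision rather than rounded to two places, so the result differs slightly in the last figure from the 4.23 obtained above.

```python
from math import sqrt


def poisson_se_frequency(stratum_sizes, stratum_successes):
    """Standard error of a frequency by formulas (114) and (115)."""
    n = sum(stratum_sizes)
    p = sum(stratum_successes) / n
    q = 1 - p
    p_j = [f / n_j for f, n_j in zip(stratum_successes, stratum_sizes)]
    # Formula (115): variance of the stratum proportions.
    var_p = sum(n_j * pj ** 2 for n_j, pj in zip(stratum_sizes, p_j)) / n - p ** 2
    # Formula (114): the simple-sample variance less n times var_p.
    return sqrt(n * p * q - n * var_p)


se_poisson = poisson_se_frequency([42, 39, 19], [5, 18, 5])
se_simple = sqrt(100 * 0.28 * 0.72)

print(round(se_poisson, 2), round(se_simple, 2))
```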

It may be objected that in Table 68 the frequencies in cols. (1), 
(2), and (3) are not large enough to yield very good estimates 
of the true values in the universe. 

b. The Standard Error of a Proportion. — In dealing with Table
67, above, as a simple sample we may think of the frequency 28
in the age class 15-24 as a proportion of the total frequency 100,
p = 28/100 = 0.28, and use formula (116) to find the standard
error of this proportion:¹

$$\epsilon_p = \sqrt{\frac{pq}{n}} \tag{116}$$

Substituting in (116),

$$\epsilon_{.28} = \sqrt{\frac{(.28)(.72)}{100}},$$

$$\epsilon_{.28} = 0.045.$$

Therefore the proportion of unemployed persons in the age
class 15-24 estimated from the sample and the range of error
of the true proportion may be written 0.28 ± 0.045. This means
that the chances are two to one that the true proportion, or
parameter, is not less than 0.235 or more than 0.325.

If we suppose, as we did in the preceding section, that Table 68
is a Poisson sample, the formula for the standard error of a
proportion is

$$\epsilon_p^2 = \frac{pq}{n} - \frac{\sigma_p^2}{n} \tag{117}$$

Using the same values as for formula (114), we have

$$\epsilon_{.28}^2 = \frac{(.28)(.72)}{100} - \frac{.023}{100},$$
$$\epsilon_{.28}^2 = 0.001786,$$
$$\epsilon_{.28} = 0.0423,$$

which is again smaller than the standard error of the same proportion
estimated from Table 67 regarded as a simple sample.
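Both standard errors of the proportion 0.28 can be reproduced in a few lines. The sketch takes the rounded value 0.023 for the variance of the stratum proportions from the worked example; the function names are ours.

```python
from math import sqrt


def se_proportion(p, n):
    """Simple sample: sqrt(pq / n)."""
    return sqrt(p * (1 - p) / n)


def poisson_se_proportion(p, n, var_p):
    """Poisson sample: sqrt(pq / n - var_p / n)."""
    return sqrt(p * (1 - p) / n - var_p / n)


simple = se_proportion(0.28, 100)
poisson = poisson_se_proportion(0.28, 100, 0.023)

print(round(simple, 3), round(poisson, 4))  # 0.045 0.0423
```

Each figure is simply the corresponding standard error of the frequency (4.5 and 4.23) divided by n = 100.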

c. The Standard Error of an Arithmetic Mean. — Even when the
universe departs considerably from normality, the means of large
samples tend themselves to be normally distributed.

Formula (118) gives the standard error of the arithmetic mean
found from a simple sample.

$$\epsilon_M = \frac{\sigma}{\sqrt{N}} \tag{118}$$

where N is the total frequency of the table or the sample, and σ is
the standard deviation of the universe, estimated from the
sample.

¹ Just as a frequency is changed to a proportional frequency by dividing
it by n, so the standard error of the former [formula (113)] is changed into
the standard error of the latter [formula (116)] in the same way:

$$\frac{\sqrt{npq}}{n} = \sqrt{\frac{pq}{n}}.$$

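The mean of the simple sample of Table 67, taking the midpoint
of the open interval at 70, is

$$M = A + i\,\frac{\sum fd}{N},$$

$$M = 40 + 10\left(\frac{-36}{100}\right) = 36.4.$$

The standard deviation is

$$\sigma = i\sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = 10\sqrt{\frac{214}{100} - \left(\frac{-36}{100}\right)^2},$$

$$\sigma = 14.2.$$

Substituting these values in formula (118),

$$\epsilon_M = \frac{14.2}{\sqrt{100}} = 1.42.$$

We therefore write for the mean and its standard error

36.4 ± 1.42.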
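The short-method computation of the mean, the standard deviation, and the standard error of the mean for Table 67 may be sketched as follows. The assumed mean A = 40 and the class interval i = 10 are those implied by the worked figures; the function name is ours.

```python
from math import sqrt

# Table 67: class frequencies and step deviations d from the 35-44 class.
frequencies = [28, 23, 21, 16, 9, 3]
deviations = [-2, -1, 0, 1, 2, 3]


def grouped_mean_sd(f, d, assumed_mean, width):
    """Mean and standard deviation of grouped data by step deviations."""
    n = sum(f)
    sum_fd = sum(fi * di for fi, di in zip(f, d))
    sum_fd2 = sum(fi * di ** 2 for fi, di in zip(f, d))
    mean = assumed_mean + width * sum_fd / n
    sd = width * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return n, mean, sd


n, mean, sd = grouped_mean_sd(frequencies, deviations, 40, 10)
se_mean = sd / sqrt(n)   # formula (118)

print(round(mean, 1), round(sd, 1), round(se_mean, 2))  # 36.4 14.2 1.42
```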


For a Poisson sample, the standard error of the mean is given
by the formula

$$\epsilon_M^2 = \frac{\sigma^2}{N} - \frac{\sigma_m^2}{N} \tag{119}$$

where σ² is the variance of the universe estimated from the total
sample, and

$$\sigma_m^2 = \frac{\sum_{j=1}^{k} n_j m_j^2}{N} - M^2 \tag{120}$$

where m_j is the mean of the jth stratum. As usual, all statistics
are estimated from the sample when the true values in the
universe are not known.

Referring to Table 68, we found that

$$\sigma^2 = (14.2)^2 = 201.64,$$

and M² = (36.4)² = 1,324.96. Let the mean age of the native
whites be m_1 = 44, of the foreign-born whites m_2 = 27.7, and
of the Negroes m_3 = 38. We compute

$$\sigma_m^2 = \frac{42(44)^2 + 39(27.7)^2 + 19(38)^2}{100} - 1{,}324.96,$$
$$\sigma_m^2 = 61.76.$$

Substituting in formula (119),

$$\epsilon_M^2 = \frac{201.64}{100} - \frac{61.76}{100} = 1.40,$$
$$\epsilon_M = 1.18.$$

As before, we see that the standard error of the Poisson sample
is smaller than that of the simple sample.
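Formulas (119) and (120) can be applied to the figures just given in a short sketch. The stratum means 44, 27.7, and 38 are those stated in the text; the function name is ours.

```python
from math import sqrt


def poisson_se_mean(variance, overall_mean, stratum_sizes, stratum_means):
    """Standard error of the mean of a Poisson (stratified) sample."""
    n = sum(stratum_sizes)
    # Formula (120): variance of the stratum means about the general mean.
    var_m = (sum(n_j * m_j ** 2
                 for n_j, m_j in zip(stratum_sizes, stratum_means)) / n
             - overall_mean ** 2)
    # Formula (119).
    return sqrt(variance / n - var_m / n), var_m


se_mean_poisson, var_m = poisson_se_mean(14.2 ** 2, 36.4,
                                         [42, 39, 19], [44, 27.7, 38])

print(round(var_m, 2), round(se_mean_poisson, 2))  # 61.76 1.18
```

The more the stratum means differ from one another, the larger σ_m² becomes and the greater the gain from stratification.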

The standard error of the mean is most useful in testing the 
significance of the difference between two means, to be treated
later. 

d. Standard Error of the Standard Deviation. — For a simple
sample drawn from an approximately normally distributed
universe, the standard error of a standard deviation is

$$\epsilon_\sigma = \frac{\sigma}{\sqrt{2N}} \tag{121}$$

where σ is the standard deviation of the universe, estimated
from the sample.


Table 69. Scores of 100 Communities on a Community Organization Test

Score (X)   Communities (f)    d     fd    fd²   Accumulated frequency   X − M   f(X − M)
80-99              9           2     18     36           100              41.4      372.6
60-79             17           1     17     17            91              21.4      363.8
40-59             43           0      0      0            74               1.4       60.2
20-39             20          -1    -20     20            31              18.6      372.0
0-19              11          -2    -22     44            11              38.6      424.6
Total            100                 -7    117                                    1,593.2



Table 67 is a J-shaped rather than a normal distribution, so
it does not lend itself to formula (121). We shall, however,
risk applying the formula to Table 69, which is only moderately
skewed. The 100 communities were taken at random from the
total of some 300 cities of a given size class in the United States,
the name of each community taken being replaced before the
next draw. The sample may, therefore, be regarded as a simple
sample, representing an infinite existent universe of cities like
the 300 cities reported by the census.

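The standard deviation of Table 69 is

$$\sigma = 20\sqrt{\frac{117}{100} - \left(\frac{-7}{100}\right)^2} = 21.6.$$

So that, by formula (121),

$$\epsilon_\sigma = \frac{21.6}{\sqrt{(2)(100)}} = 1.53.$$

And we write for the standard deviation and its standard error
21.6 ± 1.53.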


e. Standard Error of a Variance. — Assuming as before a simple
sample from an approximately normal universe, the variance,
σ², has the standard error

$$\epsilon_{\sigma^2} = \sigma^2\sqrt{\frac{2}{N}} \tag{122}$$

The variance of Table 69 is (21.6)² = 466.56, and its standard
error is

$$\epsilon_{\sigma^2} = 466.56\sqrt{\frac{2}{100}} = 65.98.$$
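The standard errors of the standard deviation and of the variance of Table 69 can be sketched together. The formulas used are the usual large-sample ones for a normal universe, σ/√(2N) and σ²√(2/N); the class interval of 20 is read from the score classes of the table. Because the code carries the unrounded standard deviation (about 21.59 rather than 21.6), its figure for the variance is slightly smaller than one worked from the rounded value.

```python
from math import sqrt

# Table 69: class frequencies and step deviations from the 40-59 class.
frequencies = [9, 17, 43, 20, 11]
deviations = [2, 1, 0, -1, -2]
width = 20

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, deviations))        # -7
sum_fd2 = sum(f * d ** 2 for f, d in zip(frequencies, deviations))  # 117

sd = width * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
se_sd = sd / sqrt(2 * n)            # standard error of the standard deviation
se_var = sd ** 2 * sqrt(2 / n)      # standard error of the variance

print(round(sd, 1), round(se_sd, 2), round(se_var, 1))
```

The two standard errors are connected by the approximate relation ε(σ²) = 2σ·ε(σ), which offers a quick check on either figure.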


f. Standard Errors of Sampling from a Limited Universe. — A
great part of the sampling done in social research is from limited
rather than from infinite universes. It has already been seen
that from a limited universe a random sample can be drawn, but
a simple sample cannot. All the formulas for finding the standard
errors of a simple sample given above, therefore, need a
correction if the sample is drawn at random from a limited
universe, i.e., if the sample is random but not simple. In the
case of a mean, frequency, or proportion, the correction consists
in applying the multiplying factor, √((U − s)/U), to the standard
error of a simple sample, where U is the number of events
in the limited universe, and s is the number of events in the
sample, so that s = n or N in our formulas above. It is not
certain that this correction is applicable to standard errors
other than those mentioned.

Two illustrations will suffice to show how the correction
factor √((U − s)/U) is used. In the section dealing with the
standard error of a frequency for Table 67, we found ε_28 = 4.5.
Now if we regard the universe of the unemployed in New York
City from which this sample of 100 was drawn as a limited
universe, consisting in the year 1930 of an average of 300,000
persons, we have

$$\sqrt{\frac{U - s}{U}} = \sqrt{\frac{300{,}000 - 100}{300{,}000}} = 0.9998.$$

Multiplying this into the standard error found by assuming an
infinite universe we get .9998(4.5) = 4.499, which for all practical
purposes is the same as before. This suggests that when the
limited universe is quite large there is no need to make the
correction.

Suppose from a limited universe of 1,382 divorces granted in a
certain court in 1939, a random sample of 200 is drawn. From
this sample the mean legal cost of getting a divorce is found
to be $136, and the standard deviation $32. If we regard the
universe as limited, the standard error of $136 is obtained by
multiplying the corrective factor into formula (118), the standard
error of the mean of a simple sample:

$$\epsilon_M = \frac{\sigma}{\sqrt{N}}\sqrt{\frac{U - s}{U}} = \frac{32}{\sqrt{200}}\sqrt{\frac{1{,}382 - 200}{1{,}382}} = \$2.09.$$

In this case, the correction for a limited universe reduces the
standard error of the mean over 7.0 per cent.
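Both illustrations of the correction for a limited universe can be checked with a short sketch. The function name is ours; the figures are those of the two examples.

```python
from math import sqrt


def finite_correction(universe_size, sample_size):
    """Multiplying factor sqrt((U - s) / U) for a limited universe."""
    return sqrt((universe_size - sample_size) / universe_size)


# Unemployed in New York City: U = 300,000, s = 100.
factor_nyc = finite_correction(300_000, 100)

# Divorce costs: U = 1,382, s = 200, standard deviation $32.
factor_divorce = finite_correction(1382, 200)
se_divorce = 32 / sqrt(200) * factor_divorce
reduction = 1 - factor_divorce      # proportional reduction in the error

print(round(factor_nyc, 4), round(se_divorce, 2), round(100 * reduction, 1))
```

The factor is practically 1 when the sample is a negligible part of the universe, but it reduces the standard error by about 7.5 per cent when 200 of 1,382 events are drawn.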

g. Standard Errors When the Unit of Sampling Is a Group of
Events, or District, and the Standard Error of a Population Rate. —
When a sample of districts, instead of individual events, is
taken, the district simply replaces the individual event in the
appropriate standard error formula. That is, n or N becomes
the number of districts, rather than the number of events.
The proper standard error formula to use in any given case
depends as before on the conditions under which the sample was
drawn. However, only those standard error formulas are
appropriate for districts that apply to variables, because, disregarding
sampling errors within districts, each district is merely
one value of a variable, such as a proportion or mean, determined
by the events within the district. In finding the mean, the
variance, and so on, of the district values, it is usually advisable
to weight the latter by the number of events in the respective
districts.


Table 70. Birth Rates in 20 Counties of Wisconsin, 1935

County     Birth rate (P_i)   Population, 1930 (Y_i)   Products (P_i Y_i)   Squares (P_i² Y_i)
 1              .0186                  8,003                 148.856              2.768718
 2              .0222                 21,054                 467.399             10.376253
 3              .0184                 34,301                 631.138             11.612947
 4              .0125                 15,006                 187.575              2.344688
 5              .0221                 70,249               1,552.503             34.310314
 6              .0175                 15,330                 268.275              4.694813
 7              .0172                 10,233                 176.008              3.027331
 8              .0157                 16,848                 264.514              4.152864
 9              .0205                 37,342                 765.511             15.692976
10              .0173                 34,165                 591.055             10.225243
11              .0174                 30,503                 530.752              9.235088
12              .0225                 16,781                 377.573              8.495381
13              .0171                112,737               1,927.803             32.965426
14              .0144                 52,092                 750.125             10.801797
15              .0208                 18,182                 378.186              7.866260
16              .0162                 46,583                 754.645             12.225243
17              .0187                 27,037                 505.592              9.454569
18              .0220                 41,087                 903.914             19.886108
19              .0178                  3,768                  67.070              1.193853
20 = n          .0173                 59,883               1,035.976             17.922383
Total                                671,184              12,284.470            229.252255



In Table 70 is a random sample of 20 counties of Wisconsin,
showing their birth rates (P_i = X_i/Y_i, where X_i = births) in
1935. The mean birth rate for the table is

$$P = \frac{\sum P_i Y_i}{\sum Y_i} = \frac{12{,}284.470}{671{,}184} = .01830, \tag{123}$$

and the variance is

$$\sigma_P^2 = \frac{\sum P_i^2 Y_i}{\sum Y_i} - P^2 = \frac{229.252255}{671{,}184} - (.01830)^2 = .00000667,$$

so that

$$\sigma_P = \sqrt{.00000667} = .0025826. \tag{124}$$

By formula (118),

$$\epsilon_P = \frac{.0025826}{\sqrt{20}} = .0005775.$$

Or, combining formulas (124) and (118), and adding a term
due to errors of sampling within a district, the standard error
of a population rate is approximately¹

$$\epsilon_P^2 = \frac{1}{n}\left[\frac{\sum P_i^2 Y_i}{\sum Y_i} - P^2\right] + \frac{P(1 - P)}{\sum Y_i} \tag{125}$$

Thus, for Table 70,

$$\epsilon_P^2 = (.0005775)^2 + \frac{(.0183)(.9817)}{671{,}184} = .0000003335 + .000000026766 = .000000360266,$$

$$\epsilon_P = 0.0006.$$


Since we think of the 71 counties of Wisconsin in 1935 as a
limited universe of birth rates, we should apply the correction²
factor, √((71 − 20)/71) = √.7183 = .8475, giving

.0006(.8475) = 0.000509

as the final standard error. We therefore write the mean birth
rate and its standard error: 0.0183 ± 0.00051, or multiplied by
1,000, 18.30 ± 0.51. The chances are 19 in 20 that the birth
rate per 1,000 for the state as a whole will be enclosed within
the sample range 18.30 ± 2(.51) = 17.28 to 19.32. As a matter
of fact, the birth rate for Wisconsin in 1935 was 17.3, almost at
the bottom limit. This is because the city of Milwaukee, with
a very low birth rate of 15.0, happened to be left out of the sample.
The birth rate for the state without Milwaukee was 18.09, which
is well within the estimated range of sampling error.

¹ More exactly, the last term is

$$\frac{1}{\sum Y_i}\left(P - \frac{\sum P_i^2 Y_i}{\sum Y_i}\right) = \frac{1}{671{,}184}\left(.0183 - \frac{229.252255}{671{,}184}\right) = 0.000000026756.$$

When the population is large, however, this term is usually negligible.

² Or, using population weights, the correction factor is [formula missing in the source], where N is the number of districts in the universe and n is the number of
districts in the sample.
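The whole computation for Table 70 can be sketched from the county rates and populations. The names are ours. The population of county 18 is entered as 41,087, the figure implied by its product column and by the printed total of 671,184. Since the code does not round the mean to .0183 before squaring, its variance is slightly smaller than .00000667, and the final standard error comes out nearer 0.000505 than 0.000509.

```python
from math import sqrt

rates = [.0186, .0222, .0184, .0125, .0221, .0175, .0172, .0157, .0205, .0173,
         .0174, .0225, .0171, .0144, .0208, .0162, .0187, .0220, .0178, .0173]
populations = [8003, 21054, 34301, 15006, 70249, 15330, 10233, 16848, 37342,
               34165, 30503, 16781, 112737, 52092, 18182, 46583, 27037, 41087,
               3768, 59883]

n_districts = len(rates)
total_population = sum(populations)

# Formula (123): population-weighted mean birth rate.
mean_rate = sum(p * y for p, y in zip(rates, populations)) / total_population

# Formula (124): population-weighted variance of the county rates.
variance = sum(y * (p - mean_rate) ** 2
               for p, y in zip(rates, populations)) / total_population

# Formula (125): between-district term plus within-district term.
se_squared = variance / n_districts + mean_rate * (1 - mean_rate) / total_population

# Correction for a limited universe of 71 counties.
se_final = sqrt(se_squared) * sqrt((71 - n_districts) / 71)

print(round(1000 * mean_rate, 2), round(1000 * se_final, 2))
```

Per 1,000 population this gives a mean birth rate of about 18.30 with a standard error of about 0.51, in agreement with the figures above.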

This case illustrates one of the dangers of sampling by dis- 
tricts, namely, that the sample may omit a district with an 
extreme value and a very large number of events. This is avoided 
when the events are sampled directly. In the case of birth 
rates and other population rates, sampling by districts is unavoidable,
but counties like Milwaukee should be subdivided into
several average-sized population districts, each with the given
Milwaukee birth rate. Then the chance of such an omission
from the sample is lessened.

It was assumed above that all the events in each sample
district were used to determine the district value. Sometimes
it is necessary to sample the events in sample districts. This
might be the case if we wanted to study a few hundred farmers'
household accounts in a given state. We would probably draw
a sample of counties, but could not get accounts from all of the
farmers in a sample county. This random sampling of events
would increase the sampling error within the districts. For
example, in formulas (123), (124), and (125), Y_i would be replaced by
the size of the sample population drawn from district i,
on which the birth rate, P_i, is calculated.

6. Control of Sampling Error by Size of Sample. — The number 
of items that a sample should contain to yield a satisfactorily 
accurate picture of the distribution in the universe from which 
it is taken depends on the number of different kinds or classes of
items that it is necessary to distinguish (i.e., on the heterogeneity)
in the universe, on the relative frequencies of the items in each
class, and on whether the items are stratified or mixed. This may
be explained by a simple illustration.

If the universe is limited and consists of two individuals, a 
white and a Negro, who are to be examined for skin color, 
evidently the sample will fall short of giving a proper picture 
of the universe, or of being representative, unless it contains
both of the individuals, or all of the items in the universe.
Should the universe contain thousands of individuals but only
two skin colors, each equally distributed among the population
and subject to no variation in shade from one individual to
another, a perfectly representative sample need still contain
only one individual of a color, each drawn at random from a
color group or stratum. If the same universe is not stratified,
however, but the sample has to be drawn at random from the two
races mixed together, a sample of more than two individuals
should be taken, since otherwise the chance of getting all
individuals of the same color is one in two
$[(\tfrac{1}{2} + \tfrac{1}{2})^2 = (\tfrac{1}{4} + \tfrac{1}{4}) + \tfrac{1}{2}]$.

In fact, a fairly large sample, say not less than 25 items, is
now advisable, to lessen the risk that one of the colors will
appear much more often than the other, and so give a false
impression of its relative frequency in the universe. Finally,
suppose that the universe includes many individuals of the same
race, say Negro, but the skin color varies widely among the
individuals. Suppose, further, that we want to learn from a
sample the relative frequency of occurrence of the various shades
of skin color, including the extreme shades that the color takes.
If some shade, say intensely black, exists in only one individual
per 1,000 in the universe, a random sample containing even
as many as 100 individuals will fail to include it nine times in
10 $[(1.000 - 0.001)^{100} \approx 0.90]$.
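The two probabilities used in this illustration can be computed directly. The following Python sketch assumes independent draws from a very large universe, so that the class proportions do not change as items are drawn; the function names are hypothetical.

```python
def chance_all_same_class(class_shares, sample_size):
    """Chance that every item in the sample falls in one and the same class."""
    return sum(share ** sample_size for share in class_shares)

def chance_of_missing(class_share, sample_size):
    """Chance that a class with the given share never appears in the sample."""
    return (1.0 - class_share) ** sample_size

# Two equally frequent colors, sample of two: (1/4 + 1/4).
two_colors = chance_all_same_class([0.5, 0.5], 2)

# A shade found in 1 individual per 1,000, sample of 100: (1.000 - 0.001) ** 100.
rare_shade = chance_of_missing(0.001, 100)

print(two_colors, round(rare_shade, 3))
```

Raising the sample size in `chance_of_missing` shows how slowly the risk of omitting a rare class falls: under the same assumptions the risk is still above one-third with 1,000 items.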

If it is wanted to use the sample merely to estimate the mean
of the universe distribution, omissions at one part of the scale
may cancel omissions at another part, so that the size of the
sample need not be so great. Yet, for a given degree of accuracy
in the estimate, the size of the sample must be increased as the
variance, $\sigma^2$, of the universe distribution, also estimated from the
sample, increases.

It is theoretically a simple matter to reduce the standard
error to any desired value by merely increasing the size of the
sample, N. For this purpose, we have the formula

$$N_2 = a^2 N_1, \qquad (126)$$

where

$$a = \frac{\epsilon_1}{\epsilon_2}, \qquad (127)$$

$\epsilon_1$ is the value of the original standard error, $\epsilon_2$ is the value of the
desired standard error, $N_1$ is the size of the original sample, and
$N_2$ is the size of sample needed to reduce $\epsilon_1$ to $\epsilon_2$.

In Sec. 5c, above, we found the mean, 36.4, and its standard
error, 1.42, from a simple sample of 100 items. What size of
sample is required to reduce the standard error to one-half its
present value? According to this requirement, $\epsilon_2 = \epsilon_1/2$, so
that $a = 2$. Substituting in formula (126),

$$N_2 = (2)^2(100) = 400.$$

Notice that when the divisor of the original standard error is 
named, we have only to multiply the size of the sample by the 
square of that divisor. That is, to divide the standard error by 
two, we multiply the size of the sample by four. This rule 
applies to any common standard error except the standard 
error of a frequency. The easiest way to deal with the standard 
error of a frequency is to substitute for it the standard error 
of the equivalent proportion, for which the above rule holds. 
For example, in Sec. 5b, above, the frequency, 28, in Table 67
was changed to the proportion, 0.28, for which the standard
error was estimated to be 0.045. To reduce this error, or the
relative error of the corresponding frequency, to one-third of
its value, we multiply $N_1 = 100$ by $(3)^2 = 9$, giving 900 as the
size of the sample required.
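The rule just stated, that the size of the sample is multiplied by the square of the divisor of the standard error, may be written as a one-line function. This is a minimal Python sketch; the function name is an assumption made for illustration.

```python
import math

def required_sample_size(original_size, divisor):
    """Sample size needed to divide the standard error by `divisor`.

    Applies N2 = a**2 * N1, where a is the ratio of the original
    standard error to the desired one; the result is rounded up.
    """
    return math.ceil(divisor ** 2 * original_size)

halved = required_sample_size(100, 2)    # standard error of the mean cut to one-half
thirded = required_sample_size(100, 3)   # standard error of the proportion cut to one-third
print(halved, thirded)
```

The quadratic growth is the practical point: each further halving of the error costs four times as many items.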

The problem of determining the proper size of fairly large
samples may be approached in terms of confidence or fiducial
limits. That is, we may require the sample to be of such a size
that about 2P times in 100 the value of a parameter in which we
are interested will be enclosed within a specified range. Using
again the example of Sec. 5c, let us say that it is wanted to take a
sample of such size that the chances are about 95 in 100 that the
parameter will be enclosed within a range that extends on either
side of S a distance equal to 10 per cent of the value of S. We
then require

$$\epsilon_S \cdot \frac{x}{\sigma} = p'S, \qquad (128)$$

where, for this particular problem, $S = M_x = 36.4$,

$$\epsilon_{M_x} = \frac{\sigma}{\sqrt{N_2}} = \frac{14.2}{\sqrt{N_2}},$$

$x$ is a mean deviate of $S$, $p'$ is one-half the width of the range
expressed as a percentage of the value of $S$, $x/\sigma$ is the value read
from a table of normal areas corresponding to one-half of the area
enclosed by the specified confidence limits, $P = .95/2 = 0.475$,
and $N_2$ is the required size of the sample. Assuming that the
distribution of sample means from large samples is approximately
normal in form, we turn to Appendix Table 1, and find that
$2P = 2(.475) = 0.95$ between the points $x/\sigma = \pm 1.96$, so that
$x/\sigma = 1.96$. Substituting in formula (128), $\epsilon_S \cdot \frac{x}{\sigma} = p'S$, we
have

$$\frac{14.2}{\sqrt{N_2}}(1.96) = .10(36.4),$$

$$\sqrt{N_2} = 7.64,$$

$$N_2 = 58.4.$$

To check this, we write

$$M_x \pm \frac{x}{\sigma}\,\epsilon_{M_x} = 36.4 \pm 1.96\left(\frac{14.2}{\sqrt{58.4}}\right) = 36.4 \pm 3.64.$$

Since 3.64 is 10 per cent of 36.4, we have the result desired.
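The same requirement can be solved for the sample size in a few lines of Python. The sketch below assumes, as the text does, that sample means are normally distributed; it takes the normal deviate from the standard library rather than from Appendix Table 1, and the function and parameter names are illustrative.

```python
from statistics import NormalDist

def sample_size_for_limits(sigma, mean, half_width_share, confidence=0.95):
    """Sample size at which the confidence limits lie within
    half_width_share * mean on either side of the mean."""
    # Normal deviate x/sigma enclosing the stated area (about 1.96 for 0.95).
    deviate = NormalDist().inv_cdf(0.5 + confidence / 2)
    root_n = deviate * sigma / (half_width_share * mean)
    return root_n ** 2

n2 = sample_size_for_limits(sigma=14.2, mean=36.4, half_width_share=0.10)
print(round(n2, 1))
```

Because the deviate is carried to more decimal places than 1.96, the result differs in the first decimal from the hand computation, but it still points to a sample of about 58 or 59 items.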

Notice, as a further check, that in our solution the standard
error of the mean is only $14.2/\sqrt{58.4} = 1.86$. If the mean age
varies by 10 per cent of its value, however, it will vary by ±3.64
years, which is $3.64/1.86 = 1.96\,\epsilon_M$. But ordinates of the normal
curve at the points ±1.96 standard errors include 95 per cent of
the area of the curve. Therefore, the chances are about 95 in 100
that the true mean will be found within ±10 per cent of the value
of the mean of our sample.

Fig. 53 — Showing 95 per cent confidence limits for the mean of a random
sample of 58 items. About ninety-five chances in a hundred the true mean will
be enclosed within the limits 32.76 and 40.04.

7. Error in Mean vs. Individual Predictions from a Regression
Equation. — Interest often centers in predicting averages rather 
than individual values from a regression equation.¹ Thus, out
of 20 counties with birth rates of 18 per 1,000, how far does the
mean observed death rate differ from the most probable death
rate predicted by the regression equation? The scatter of the
observed means of such samples around the predicted value
in this case depends upon the size of the sample, N, as well as
upon the value of r, and may be found from the equation


$$\epsilon_M = \frac{S_y}{\sqrt{N}}. \qquad (130)$$


It appears from this that the scatter in predicting the mean
value of Y corresponding to a given value of X, or to the midpoint
of an X class interval, is reduced, compared to the scatter
in calculating any individual value of Y, in proportion to the
square root of the size of the sample from which the mean is
found. For the data of Table 50 of Chap. X, if we take a
sample of 20 counties all with approximately the same birth
rate, equation (130) gives

$$\epsilon_M = \frac{1.86}{\sqrt{20}} = 0.416,$$

which is less than a quarter of the size of the standard error of
estimate, $S_y$ (1.86), that governs the prediction of a death
rate from a birth rate in the case of a single county.
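As a brief check on this figure, the reduction of the scatter by the square root of the sample size may be sketched in Python. The function name is hypothetical, and the form of the equation (standard error of estimate divided by the square root of N) is inferred from the worked value above.

```python
import math

def error_of_mean_prediction(standard_error_of_estimate, sample_size):
    """Scatter of sample means about the value predicted by the regression."""
    return standard_error_of_estimate / math.sqrt(sample_size)

single_county = 1.86                                   # standard error of estimate
twenty_counties = error_of_mean_prediction(1.86, 20)   # mean of 20 counties
print(round(twenty_counties, 3), round(twenty_counties / single_county, 3))
```

The second printed figure is the ratio of the two errors, which for 20 counties is a little under one-quarter.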

8. Representativeness of a Sample. — The test of the goodness 
of a sample is simply the test of its representativeness. If we 
knew the value of the parameter, we could measure the repre- 
sentativeness of the sample in terms of the percentage deviation 
of the statistic from the parameter. Thus, if s is the statistic
and S is the corresponding parameter, the formula for measuring
the representativeness, $R_p$, is

$$R_p = \left[100 - 100\left(\frac{s - S}{S}\right)\right] \text{ per cent}, \qquad (131)$$

where we take $S - s$ if $S > s$.

The value of the parameter is seldom known, however, for if
it were, it is not likely that a sample would be taken. This
means that there is generally no direct way of measuring the
representativeness of a sample. The nearest approximation
would be to take several additional samples, each equivalent in
method of drawing and (preferably) in size to the original sample.
Then, in addition to noting the variation of certain statistics
from one of these samples to another, we might pool the samples
to obtain an average statistic, or average the statistics found from
them, and substitute this average value for S in formula (131),
above. But if this were done, we would of course at once
abandon the statistic from the original sample in favor of the
average statistic from the several samples, whose representativeness
would still be unknown. As a rule, therefore, the best that
we can do in the way of formulating an index of representativeness
is to rely on a large sample, and, where possible, stratification
of the universe, and measure the probable maximum deviation
of the statistic from the parameter in terms of, say, two standard
errors of the statistic ($\epsilon_s$). This permits us to say that the
probable minimum representativeness, $R_p$, of the statistic is

$$R_p = \left[100 - 200\left(\frac{\epsilon_s}{s}\right)\right] \text{ per cent}. \qquad (132)$$

¹ See Chap. X, Tables 50 and 51.

In Sec. 5c, above, we found the mean age of a simple sample of
100 ages to be 36.4 years, and the standard error of this mean to
be 1.42 years. If we knew that the mean age in the universe
from which the sample was taken was 37.5 years, we would find
the representativeness of the sample by formula (131) to be

$$R_p = \left[100 - 100\left(\frac{37.5 - 36.4}{37.5}\right)\right] = 100 - 2.9 = 97.1 \text{ per cent}.$$

But if we did not know the parameter value, we would estimate
the probable minimum representativeness by formula (132),

$$R_p = \left[100 - 200\left(\frac{1.42}{36.4}\right)\right] = 100 - 7.8 = 92.2 \text{ per cent}.$$

An indirect but important method of judging the representativeness
of a sample makes use of the circumstance that although
the value of the attribute or variable for which we are sampling
is not known in the universe, other universe values may be
known; and if the sample can be shown to be representative of the
latter, it is likely to be representative of the former also. As an
illustration of this, we may draw a random sample of families in
an Alabama county for the purpose of determining by field visits
the percentage whose annual income falls below a certain minimum
level. After the sample is obtained, it may be compared
with the figures of the latest Federal census for the given county in
regard to median size of family, the proportions of families having
different numbers of children under 10 years of age, the percentage
of families that do not own their home, and the median
rental paid. If a reasonably close agreement is found between
the sample and the census population in these respects, the
sample may usually be regarded as satisfactory also for the study
of incomes.


Exercises 

1. Define in both time and space (1) an infinite universe, (2) a limited
universe, (3) a hypothetical universe, choosing in each case a universe
of interest to social scientists.

2. Give an example of a universe of social attributes, and define the
event and the "success."

3. Illustrate a universe of a social variable.

4. Draw a sample of events or values from an actual known social
universe, so that the sample will be (1) random, (2) simple, (3) Poisson
(stratified).

5. Draw a random sample of districts from a known social universe
of your own choosing.

6. In Table 34 of Chap. VIII, what is the standard error of the
frequency in the class X = 0? What does it

7. In Table 34 of Chap. VIII, what is the standard error of the
proportion of prisoners with no previous arrests? How does this standard
error compare with that for a frequency found in Exercise 6 above?

8. Ten thousand marriage certificates issued in the same month in
five large American cities are taken as the universe, and a random
sample of 500 certificates is drawn from them. After one year, it is
found that 78 of the 500 marriages have ended in divorce. What is the mean
probability of divorce in this heterogeneous universe of marriages,
and what is its approximate standard error?

9. Judging from the sample in the following table, what is a range
within which the true number of Orientals immigrating to the United
States in the year 1929, expressed as a percentage of the total Oriental
immigration over the five-year period, 1925-1929, will fall 95 times out
of 100? Compare the standard errors found on the assumptions of a
simple sample from an infinite universe, a random sample from a
limited universe (total Chinese immigrants, 5,090, total Japanese
immigrants, 2,314), and a Poisson sample from a limited universe. If
the sampling was random and proportional between Chinese and
Japanese, which of these assumptions seems preferable, and why?

Sample of 740 Chinese and Japanese Immigrants to the United States,
by Year of Arrival

Year             Chinese   Japanese   Total
1929                 102         65     167
1928                 115         49     164
1927                 105         43     148
1925 and 1926        187         74     261
Total                509        231     740

10. What is the standard error of the mean in the table of Exercise 3
of Chap. XIII for urban families? How do you interpret it?

11. Below are given the number of children under 5 years of age and
the number of women aged 15-45 years for each of 20 random counties
in Wisconsin in 1930, with the resulting fertility ratios.

Within what range will the fertility ratio for the state of Wisconsin
fall, 95 times out of 100? (Note: the fertility ratio in the State of
Wisconsin as a whole in 1930 was about 0.41. There are 71 counties
in the state.)

Fertility Ratios and Basic Data, 20 Random Counties in Wisconsin, 1930*

County code   Children under 5 (X)   Women 15-45 (Y)   Fertility ratio
 1                     731                 1,523             .48
 2                   1,968                 4,331             .45
 3                   3,463                 7,084             .49
 4                   1,243                 2,723             .46
 5                   6,998                16,408             .43
 6                   1,562                 3,157             .49
 7                     953                 1,924
 8                   1,619                 3,526             .46
 9                   3,707                 8,118             .46
10                   3,330                 6,765             .49
11                   2,536                 6,166             .41
12                   1,745                 3,339             .52
13                  10,016                27,401             .37
14                   4,504                10,889             .41
15                   1,757                 3,671             .48
16                   3,707                10,546             .35
17                   2,796                 5,550             .50
18                   3,758                 9,692             .39
19                     395                   648             .61
20                   5,364                13,330             .40
Total               62,152               146,791

* From Fifteenth Census of the United States, 1930, Population, Vol. III, Part 2, pp.
1314-1319.

12. Within what range will the standard deviation of the fertility
ratios in the universe fall 95 times in 100, according to the random
sample in the table of Exercise 11 above? Can the standard error of
the standard deviation be applied to urban families in the table of
Exercise 3 of Chap. XIII? Explain.

13. What size sample of rural nonfarm families in the table of
Exercise 3 of Chap. XIII is needed to reduce the standard error to
one-half its value?

14. In the table of Exercise 9 above, what size sample of Japanese is
required to confine the true value of the proportion of immigrants in
the year 1929 within 5 per cent of the observed value 99 times in 100
(i.e., within 99 per cent confidence limits)?

15. Measure the probable minimum representativeness of the mean
score in Table 69, above.

CHAPTER XIII


THE SIGNIFICANCE OF DIFFERENCES 

1. The Meaning of Tests of Significance. — It has been seen
that the value of a statistic estimated from a random sample usu- 
ally differs somewhat from the true value, or parameter, in the 
universe from which the sample is drawn. Similarly, the values 
of a statistic, such as the mean, yielded by two or more random 
samples from the same universe, will almost never be exactly 
the same, and may sometimes be quite far apart. Such varia- 
tions, however, are due merely to chance errors of sampling 
and imply no actual differences. On the other hand, samples 
from different universes yield statistics of different values which 
represent real differences in the parameters. It therefore becomes 
a matter of great importance in investigations based on sampling 
to distinguish between real differences and accidental ones. 

If we could be certain that two or more samples were taken 
at random from the same universe or from different universes, 
there would, of course, be no problem. In most of the practical 
sampling work done in the social sciences, however, the investi- 
gator cannot feel entirely confident that his samples are random, 
and he knows so little about the universes from which they are 
taken that he cannot say whether these universes are essentially 
the same or different. For example, if we try to take random 
samples of 500 persons each from the total population of a city 
like Chicago, it will not be easy to insure that the selection will
be random, or even to guarantee that the persons drawn will all
belong to the population of Chicago. If the several samples are
not taken on the same day or even at the same hour, the populations
sampled may be radically different, because of the traffic in
and out of the city in the mornings and evenings, on week ends
and holidays. As a consequence of such uncertainties, an
investigator feels the need for some kind of test that will lend
additional security to any inferences that he may draw from
samples. The development of such tests, based on the mathematical
theory of probability, constitutes the major part of
present-day statistical method.

In Chap. XII, it was usually assumed that if the sampling was 
random or simple from a normally distributed universe, the 
statistic itself would be normally distributed over many samples. 
By the use of the standard error, it then became possible to
estimate from the normal curve the probability that the parameter
would be enclosed within specified limits. It is now necessary
only to extend these ideas to the differences between
statistics, and to direct attention to the common rule that if a
difference as large as the one observed might occur by chance 
no oftener than five times in 100, it is regarded as a real 
difference. In that case, the difference is said to be significant. 

The differences that are tested by this method are of two general
kinds. The first is the difference between the value of a statistic
and the value of a known or hypothetical parameter. For
example, can a group of mothers with a mean age of 27 years be a
random sample from a universe of mothers whose mean age is
24 years? Or, can a correlation coefficient, r = .34, be a random
statistic from a universe in which there is no correlation, i.e.,
where r = 0? The second kind of difference that is frequently
dealt with is the difference between the statistics from two or
more samples. Thus, can two groups of mothers, one with a
mean age of 27 years and the other with a mean age of 31 years,
be random samples from the same universe? If the test shows
that the difference is significant, it is inferred that the answer
to the above questions is negative, on the grounds that a negative
answer is highly probable. If the test fails to show a significant
difference, the sample is regarded as a random sample from a
given universe, or two samples are regarded as random samples
from the same universe, until tests applied to larger samples show
the contrary. If it is not positively known that the samples are
random, a nonsignificant test at least allows us to say that the
observed differences are no greater than might occur with random
samples.

If a difference is defined as real when the probability of its
occurring by chance is as low as five in 100, we are said to be
using the 5 per cent level of significance. The fixing of this critical
probability is arbitrary and a matter of convention. The 5 per
cent level is rather widely used at present, but the 1 per cent
level is preferred when there is need to be more conservative.
Reference to Appendix Table 1 will show that 5 per cent of the
area of the normal curve lies beyond ordinates erected at about
two standard errors on each side of the mean, while 1 per cent
of the area falls beyond ordinates at plus and minus 2.58 standard
errors (see Fig. 54). It was formerly the practice to insist that
an observed difference must fall as far out as three standard
errors. At that point the probability is only about 27 in 10,000
that so large a difference might occur by chance in either direction.
This is too stringent for ordinary purposes, because it
causes the investigator to withhold judgment in an unnecessarily
large proportion of cases.


Fig. 54 — Five per cent of the area of the normal curve taken at the positive end
only, and divided equally between the positive and negative ends.

Notice that since the 5 per cent level of significance, for
example, includes 2.5 per cent of the area of the normal curve
at each end of the X scale, it implies that the probability of
getting either a positive or a negative difference is sought. If
it is desired to find the probability of getting, say, a positive
difference only, the reading is limited to the positive end of the
scale (see Fig. 54).
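The areas referred to in this section can be read from the normal curve by computation instead of from Appendix Table 1. The following Python sketch uses the standard library's normal distribution; the function name and the `both_ends` switch are illustrative choices.

```python
from statistics import NormalDist

def chance_beyond(critical_ratio, both_ends=True):
    """Area of the normal curve beyond the given number of standard errors.

    With both_ends=True the areas at the positive and negative ends are
    added together; otherwise only the positive end is read.
    """
    one_end = 1 - NormalDist().cdf(abs(critical_ratio))
    return 2 * one_end if both_ends else one_end

five_per_cent = chance_beyond(1.96)      # about 5 in 100
one_per_cent = chance_beyond(2.58)       # about 1 in 100
three_sigma = chance_beyond(3.0)         # about 27 in 10,000
positive_only = chance_beyond(1.96, both_ends=False)   # about 2.5 in 100
print(round(five_per_cent, 4), round(one_per_cent, 4),
      round(three_sigma, 4), round(positive_only, 4))
```

Reading one end only halves the probability attached to a given critical ratio, which is why the direction of the difference must be settled before the test is made.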

2. The Significance of a Correlation Coefficient. — Suppose we 
have a correlation coefficient r = .34 from a simple sample of
30 pairs of variates from normal universes, one variate being
scores on an I.Q. test and the other the scores of the same
individuals on a personality test. Is the value of the observed r here
so small that it might occur as a random error in a sample from a
universe in which there is no correlation?

To answer this question, we test the difference of the observed
value of r from zero. Appendix Table 4 has been designed to
provide a ready-made test of this sort in the case of the correlation
coefficient. From it the value of the coefficient that is
just significantly different from zero at the 5 per cent (or the
1 per cent) level of significance may be read off at once, and
compared with the observed value. The table is entered with
N − 2 degrees of freedom, which in this case are 30 − 2 = 28.
At the 5 per cent level we find that an r = .36 is just significant.
Since the value of our r (±.34) is slightly smaller than this, it
might occur by chance a little oftener than five times in 100.
If we are governed strictly by the 5 per cent criterion, therefore,
we cannot accept an r = .34 as significantly different
from zero.

For simple samples from a normal universe so large that N is
not covered by Appendix Table 4, formula (133) is convenient
to test the hypothesis that the observed value of r is not different
from zero.

$$\epsilon_r = \frac{1}{\sqrt{N}}. \qquad (133)$$

In the example above,

$$\epsilon_r = \frac{1}{\sqrt{30}} = 0.18.$$

The ratio of the statistic r to its standard error, called the
critical ratio (C.R.), is

$$\text{C.R.} = \frac{r}{\epsilon_r} = \frac{.34}{.18} = 1.89.$$

Entering a table of normal areas (Appendix Table 1) with
C.R. = $x/\sigma$ = 1.89, the probability is found to be about six
in 100 that a larger value of r than that observed might occur
because of random errors of sampling. Again we find that the
value r = .34 is not quite significantly different from zero. It
might have come from a universe in which there was no correlation
at all.
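The critical-ratio test of a correlation coefficient may be sketched in Python as follows. The sketch assumes the large-sample standard error 1/√N implied by the worked figure 1/√30 = 0.18, and it reads the normal area by computation; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def critical_ratio_of_r(r, sample_size):
    """Ratio of r to its standard error under the hypothesis of no correlation."""
    standard_error = 1 / math.sqrt(sample_size)
    return r / standard_error

def chance_of_larger_r(r, sample_size):
    """Chance that random errors of sampling give a larger |r| than observed."""
    ratio = abs(critical_ratio_of_r(r, sample_size))
    return 2 * (1 - NormalDist().cdf(ratio))

ratio = critical_ratio_of_r(0.34, 30)
chance = chance_of_larger_r(0.34, 30)
print(round(ratio, 2), round(chance, 3))
```

Because the standard error is not rounded to 0.18 before dividing, the ratio comes out slightly below the hand figure of 1.89; the chance is still about six in 100, so the verdict is unchanged.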

3. The Significance of $g_1$ and $g_2$. — In a problem in Chap. IX
we found the measure of skewness of a certain distribution
to be $g_1 = 1.17$. The standard error of $g_1$ in large samples is
approximately

$$\epsilon_{g_1} = \sqrt{\frac{6}{N}}. \qquad (134)$$

Substituting in formula (134), with N = 252,

$$\epsilon_{g_1} = \sqrt{\frac{6}{252}} = 0.154.$$

If now we divide the value of $g_1$ by its standard error, we get the
critical ratio 1.17/0.154 = 7.6. Since this is much more than
two standard errors, chance is ruled out, and we conclude that the
distribution could not reasonably be regarded as a random sample
from a normal universe.

Let us next test the value of the measure of kurtosis, $g_2 = 0.86$,
found for the same distribution in Chap. IX. The standard
error of $g_2$ for large samples is

$$\epsilon_{g_2} = \sqrt{\frac{24}{N}}. \qquad (135)$$

For N = 252,

$$\epsilon_{g_2} = \sqrt{\frac{24}{252}} = 0.309.$$

The critical ratio is therefore 0.86/0.309 = 2.78. Thus the
peaked distribution in question could not have been drawn at
random from a normally distributed universe.¹
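Both tests can be retraced in Python. The sketch assumes the usual large-sample standard errors, √(6/N) for the measure of skewness and √(24/N) for the measure of kurtosis, which reproduce the values 0.154 and 0.309 quoted above for N = 252; the function names are illustrative.

```python
import math

def standard_error_skewness(sample_size):
    """Large-sample standard error of the skewness measure g1 (assumed form)."""
    return math.sqrt(6 / sample_size)

def standard_error_kurtosis(sample_size):
    """Large-sample standard error of the kurtosis measure g2 (assumed form)."""
    return math.sqrt(24 / sample_size)

n = 252
ratio_g1 = 1.17 / standard_error_skewness(n)   # critical ratio for skewness
ratio_g2 = 0.86 / standard_error_kurtosis(n)   # critical ratio for kurtosis
print(round(standard_error_skewness(n), 3), round(standard_error_kurtosis(n), 3),
      round(ratio_g1, 1), round(ratio_g2, 1))
```

Both ratios exceed two, the skewness ratio by a wide margin, so either measure alone is enough to reject the hypothesis of a normal universe.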

4. The Significance of the Difference between Any Two
Statistics. — The variance of the differences, D, between N paired
values of two variables, $X_1$ and $X_2$, is, by the usual formula,

$$\sigma_D^2 = \frac{\Sigma(D - M_D)^2}{N},$$

where $M_D$ is the mean of the differences. Or, since

$$D = X_1 - X_2, \quad \text{and} \quad M_D = \frac{\Sigma D}{N} = \frac{\Sigma(X_1 - X_2)}{N} = M_{X_1} - M_{X_2},$$

$$\sigma_D^2 = \frac{\Sigma[(X_1 - M_{X_1}) - (X_2 - M_{X_2})]^2}{N}.$$

¹ For a more exact interpretation of the critical ratio see L. H. C.
Tippett, The Methods of Statistics, 2nd ed., page 86.

Letting $X_1 - M_{X_1} = x_1$, and $X_2 - M_{X_2} = x_2$,

$$\sigma_D^2 = \frac{\Sigma(x_1 - x_2)^2}{N} = \frac{\Sigma(x_1^2 - 2x_1x_2 + x_2^2)}{N} = \frac{\Sigma x_1^2}{N} - \frac{2\Sigma x_1x_2}{N} + \frac{\Sigma x_2^2}{N} = \sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\,\frac{\Sigma x_1x_2}{N\sigma_1\sigma_2}.$$

By formula (81), $\Sigma x_1x_2 / N\sigma_1\sigma_2 = r_{12}$, so that

$$\sigma_D^2 = \sigma_1^2 + \sigma_2^2 - 2r_{12}\sigma_1\sigma_2.$$

If now we let $\sigma = \epsilon$, we have

$$\epsilon_D^2 = \epsilon_1^2 + \epsilon_2^2 - 2r_{12}\epsilon_1\epsilon_2, \qquad (136)$$


where $\epsilon_1$ is the standard error of the statistic in the first sample,
$\epsilon_2$ is the standard error of the corresponding statistic in the
second sample, and $r_{12}$ is the correlation coefficient between a
number of sample values of the two statistics.

Usually, correlation between two sample statistics is purposely
introduced by drawing one sample at random, and then matching
on some principle each of the items or values so drawn with an
item or value from another population. For example, the I.Q.'s
of a random sample of criminals may be matched or paired with
the I.Q.'s of their brothers.

If the statistics are the means or proportions from two samples
whose individual values are matched in some way, the simple
correlation coefficient, $r_{12}$, in the case of means, or $r_t$ (tetrachoric
correlation coefficient) in the case of proportions, may be used
to determine the amount of correlation between the paired
items of the two samples. Where correlation is believed to
exist between two samples, but it is not known what items are
paired or on what principle the correlation depends, it is often
difficult to find the value of $r_{12}$.

When there is no correlation between the two samples, i.e.,
when both of the samples are random or simple and so are
independent, $r_{12} = 0$, and formula (136) reduces to

$$\epsilon_D^2 = \epsilon_1^2 + \epsilon_2^2. \qquad (137)$$

Formulas (136) and (137) make no assumptions in addition
to those involved in finding $\epsilon_1$ and $\epsilon_2$.

5. The Difference between Two Means. — A simpler formula
than (136) for testing the difference between the means of two
matched samples is

$$\epsilon_D = \frac{\sigma_d}{\sqrt{N}}, \qquad (138)$$

where $\sigma_d$ is the standard deviation of the differences between
the paired values, and N is the number of pairs. The $\sigma_d$ is
estimated from the usual formula for the standard deviation.
Formula (138) assumes that the experimental sample (i.e., the
random sample that receives the "treatment"¹) is a simple
sample.

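The scores of a simple sample of brothers and their sisters on a
personality test are shown in Table 71. Are the means of the
two series significantly different? Correlation is evidently present
between brother and sister, so it is necessary to calculate the
correlation between them or else to use formula (138). We shall
do both, for comparison. For Table 71 we have, by formula (74),

$$r = \frac{N\Sigma XY - \Sigma X\,\Sigma Y}{\sqrt{[N\Sigma X^2 - (\Sigma X)^2][N\Sigma Y^2 - (\Sigma Y)^2]}} = \frac{40(1{,}215) - 212(203)}{\sqrt{[40(1{,}320) - (212)^2][40(1{,}231) - (203)^2]}},$$

$$r = .70.$$

Also,

$$\sigma_d = \sqrt{\frac{\Sigma d^2}{N} - \left(\frac{\Sigma d}{N}\right)^2} = \sqrt{\frac{121}{40} - \left(\frac{9}{40}\right)^2},$$

$$\sigma_d = 1.73.$$

Now, the standard error squared of the mean of the X's is, by
formula (118) of Chap. XII,

$$\epsilon_{M_X}^2 = \frac{\sigma_X^2}{N}, \qquad \sigma_X^2 = \frac{1{,}320}{40} - \left(\frac{212}{40}\right)^2 = 4.91, \qquad \epsilon_{M_X}^2 = \frac{4.91}{40} = 0.12.$$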

¹ See Chaps. III and IV.

Table 71 — Personality Test Scores of 40 Pairs of Brothers and
Sisters. (Hypothetical Data)

Brother (X)   Sister (Y)      XY      X²      Y²      d     d²
      8            5          40      64      25      3      9
      3            4          12       9      16     -1      1
      8            7          56      64      49      1      1
      2            2           4       4       4      0      0
      2            3           6       4       9     -1      1
      4            7          28      16      49     -3      9
      6            5          30      36      25      1      1
      5            3          15      25       9      2      4
      7            9          63      49      81     -2      4
     10            9          90     100      81      1      1
     10            8          80     100      64      2      4
      1            3           3       1       9     -2      4
      9            7          63      81      49      2      4
      8            6          48      64      36      2      4
      7            5          35      49      25      2      4
      7            8          56      49      64     -1      1
      5            4          20      25      16      1      1
      5            6          30      25      36     -1      1
      5            5          25      25      25      0      0
      4            6          24      16      36     -2      4
      4            4          16      16      16      0      0
      4            3          12      16       9      1      1
      4            1           4      16       1      3      9
      3            3           9       9       9      0      0
      3            2           6       9       4      1      1
      3            5          15       9      25     -2      4
      3            4          12       9      16     -1      1
      5            7          35      25      49     -2      4
      8           10          80      64     100     -2      4
      6            5          30      36      25      1      1
      6            4          24      36      16      2      4
      7            7          49      49      49      0      0
      7            4          28      49      16      3      9
      5            8          40      25      64     -3      9
      4            1           4      16       1      3      9
      2            2           4       4       4      0      0
      6            5          30      36      25      1      1
      4            3          12      16       9      1      1
      7            6          42      49      36      1      1
      5            7          35      25      49     -2      4
Total 212        203       1,215   1,320   1,231      9    121

Similarly, for the Y's,

$$\sigma_Y^2 = \frac{1{,}231}{40} - \left(\frac{203}{40}\right)^2 = 5.02, \qquad \epsilon_{M_Y}^2 = \frac{5.02}{40} = 0.13.$$

Substituting in formula (136),

$$\epsilon_D^2 = 0.12 - 2(0.7)(0.35)(0.36) + 0.13,$$

$$\epsilon_D^2 = 0.07, \text{ or } \epsilon_D = 0.27.$$

Now,

$$M_X = \frac{\Sigma X}{N} = \frac{212}{40} = 5.30,$$

$$M_Y = \frac{\Sigma Y}{N} = \frac{203}{40} = 5.08,$$

$$\text{C.R.}^* = \frac{5.30 - 5.08}{0.27} = 0.81.$$

The critical ratio is much less than two, so the difference between
the two means is not significant.

Substituting next in formula (138), 


= 

V40 


0 27, 


which quickly gives the same value obtained by the longer 
method. 

The meaning of this result is that there is no more difference 
between the scores of brothers and sisters on a personality test 
than might be attributed to random errors of samphng 

Suppose we had neglected the correlation between the data 
of Table 71, and used formula (137) to test the significance of the 
difference between the two means. How much would the result 
have been changed? We have, from formula (137), 


and 


e:,* = 0.12 + 0 13 = 0 25 


C.R. 


5 30 - 5 08 


0 44. 


* In testing differences, the critical ratio (C,R ) is the ratio of the differ- 
ence to the standard error of the difference. 
The correction for dependence thus almost doubled the critical
ratio, although in this instance it did not change the verdict
regarding the significance of the difference.
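
The comparison just made can be checked from the column totals of Table 71 alone. The following is a minimal sketch, not part of the original text: the function name and argument names are illustrative assumptions, and because it carries full decimals its critical ratios differ slightly from the rounded figures 0.81 and 0.44 worked above.

```python
import math

def paired_standard_errors(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Standard errors of the difference between two correlated means.

    Returns (se_corr, se_diff, se_indep, r, diff):
      se_corr  - formula (136), allowing for the correlation r
      se_diff  - the same quantity from the paired differences d = X - Y
      se_indep - formula (137), correlation ignored
    """
    mean_x, mean_y = sum_x / n, sum_y / n
    var_x = sum_x2 / n - mean_x ** 2
    var_y = sum_y2 / n - mean_y ** 2
    r = (sum_xy / n - mean_x * mean_y) / math.sqrt(var_x * var_y)

    se_x, se_y = math.sqrt(var_x / n), math.sqrt(var_y / n)
    se_corr = math.sqrt(se_x ** 2 - 2 * r * se_x * se_y + se_y ** 2)

    # Sum of d and of d squared follow from the other totals.
    sum_d = sum_x - sum_y
    sum_d2 = sum_x2 + sum_y2 - 2 * sum_xy
    var_d = sum_d2 / n - (sum_d / n) ** 2
    se_diff = math.sqrt(var_d / n)

    se_indep = math.sqrt(se_x ** 2 + se_y ** 2)
    return se_corr, se_diff, se_indep, r, mean_x - mean_y

# Totals of Table 71 (N = 40 pairs).
se_corr, se_diff, se_indep, r, diff = paired_standard_errors(
    40, 212, 203, 1215, 1320, 1231)
print("r =", round(r, 2))
print("C.R. with correlation:   ", round(diff / se_corr, 2))
print("C.R. ignoring correlation:", round(diff / se_indep, 2))
```

The two routes to the correlated standard error agree exactly, which is why the shorter method can replace the longer one.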

When the hypothesis to be tested is that two simple samples
were drawn from the same universe, the best estimate of any
parameter is found by pooling the two samples. For example,
if we are testing the difference between the means of two samples,
formula (137) becomes

$$\epsilon_D^2 = \frac{\sigma^2}{N_1} + \frac{\sigma^2}{N_2}, \qquad (139)$$

where $N_1$ is the number of cases in the first sample, $N_2$ is the
number of cases in the second sample, and $\sigma^2$ is found from the
two samples combined by the equation

$$\sigma^2 = \frac{\Sigma (X_1 - M_{X_1})^2 + \Sigma (X_2 - M_{X_2})^2}{N_1 + N_2}, \qquad (140)^1$$

where $X_1$ is any value of the variate in the first sample, $M_{X_1}$ is
the mean of the first sample, $X_2$ is any value of the variate in the
second sample, and $M_{X_2}$ is the mean of the second sample.

Table 72, below, gives the scores of 75 communities on a
community organization test, the sample of communities being
simple and independent of the sample of 100 communities in
Table 69 of Chap. XII. The mean score of Table 72 is 48.4,
and its standard deviation is 23.

Table 72. Scores of 75 Communities on a Community Organization Test

Score (X)    Communities (f)     d      fd     fd²
80-99               7            2      14      28
60-79              15            1      15      15
40-59              29            0       0       0
20-39              13           -1     -13      13
 0-19              11           -2     -22      44
Total              75                   -6     100

1 This formula gives simply a weighted mean of the two variances, $\sigma_1^2$ and
$\sigma_2^2$, and should not be confused with formula (29) of Chap. VIII, which gives
the variance of combined distributions.

Let us test the hypothesis that these two tables represent simple
samples from the same universe. The samples being independent by
definition, we require formula (139). The variance of the two samples
combined is found by formula (140), expressed in frequency
form:


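$$\sigma^2 = \frac{i^2\left[\Sigma f_1 d_1^2 - \dfrac{(\Sigma f_1 d_1)^2}{N_1} + \Sigma f_2 d_2^2 - \dfrac{(\Sigma f_2 d_2)^2}{N_2}\right]}{N_1 + N_2} \qquad (140a)$$

$$\sigma^2 = \frac{(20)^2\left[117 - \dfrac{(-7)^2}{100} + 100 - \dfrac{(-6)^2}{75}\right]}{100 + 75},$$

$$\sigma^2 = 493.78.$$

Substituting this value in formula (139),

$$\epsilon_D^2 = 493.78\left(\frac{1}{100} + \frac{1}{75}\right) = 11.52,$$
$$\epsilon_D = 3.394.$$

We therefore have

$$\text{C.R.} = \frac{M_{X_1} - M_{X_2}}{\epsilon_D} = \frac{48.6 - 48.4}{3.39} = 0.0589.$$

Evidently, the data of the two tables might well represent
simple samples from the same universe, as far as their mean
values are concerned.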
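
The pooled-variance test can be reproduced from the grouped totals of Tables 69 and 72. This is a small sketch under stated assumptions: the function names are hypothetical, and the totals for Table 69 (sum of fd = -7, sum of fd² = 117, N = 100) are taken from the substitution shown in formula (140a).

```python
import math

def pooled_variance_grouped(groups, interval):
    """Formula (140a). Each group is (N, sum_fd, sum_fd2) in class-interval units."""
    sum_squares = sum(fd2 - fd ** 2 / n for n, fd, fd2 in groups)
    total_n = sum(n for n, _, _ in groups)
    return interval ** 2 * sum_squares / total_n

def se_same_universe(variance, n1, n2):
    """Formula (139): standard error of the difference between two means."""
    return math.sqrt(variance * (1 / n1 + 1 / n2))

# Table 69 (N = 100) and Table 72 (N = 75), class interval of 20 points.
variance = pooled_variance_grouped([(100, -7, 117), (75, -6, 100)], interval=20)
se = se_same_universe(variance, 100, 75)
critical_ratio = (48.6 - 48.4) / se
print(round(variance, 2), round(se, 3), round(critical_ratio, 4))
```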

If it is believed that two simple samples are from different
universes, and it is wanted to test whether the difference between
their means falls within the range of chance error so that it
might sometimes be obliterated by sampling error, formula (137)
takes the form

$$\epsilon_D^2 = \frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}. \qquad (141)$$

Applying this formula to Tables 69 and 72, we get

$$\epsilon_D^2 = \frac{466.56}{100} + \frac{530.77}{75} = 11.74,$$
$$\epsilon_D = 3.43,$$

which is slightly larger than the standard error obtained on the
assumption that the samples are from the same universe. Since
the critical ratio is only

$$\text{C.R.} = \frac{48.6 - 48.4}{3.43} = 0.058,$$

we interpret it to mean that if there is a real difference between
the universes from which the two samples came, it may easily be
reduced to zero or nearly zero in random samples.

Sometimes two simple samples are taken from the same
universe, and the mean of sample 1 is tested against the mean
of the two samples combined. Correlation is thereby introduced,
and the appropriate formula is then

$$\epsilon_D^2 = \frac{\sigma^2 N_2}{N_1(N_1 + N_2)}, \qquad (142)$$

where $\sigma^2$ is found from the two samples combined, using formula
(140) or formula (140a). This formula leads to the same critical
ratio as formula (139). To show this, let us test the mean score
(48.4) of the 75 communities in Table 72 against the mean
score (48.51) of Table 72 and Table 69 of Chap. XII combined,
on the theory that the two samples together give a better picture
of the universe of communities from which the samples were
drawn than does either one alone. Substituting in formula (142)
the values previously found,

$$\epsilon_D^2 = \frac{493.78(100)}{75(75 + 100)} = 3.76,$$
$$\epsilon_D = 1.9396,$$
$$\text{C.R.} = \frac{48.51 - 48.4}{1.9396} = 0.0589,$$

which is identical with the result previously obtained.
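
The agreement of the two critical ratios is not peculiar to this example. A short derivation, offered as a sketch in the notation of this section (the symbol $M_c$ for the mean of the two samples combined is an assumption of notation, not the author's), shows why:

```latex
\begin{aligned}
M_c &= \frac{N_1 M_1 + N_2 M_2}{N_1 + N_2}, \\[4pt]
M_1 - M_c &= \frac{N_2}{N_1 + N_2}\,(M_1 - M_2), \\[4pt]
\frac{\sigma^2 N_2}{N_1 (N_1 + N_2)}
  &= \left(\frac{N_2}{N_1 + N_2}\right)^{2}
     \sigma^2 \left(\frac{1}{N_1} + \frac{1}{N_2}\right).
\end{aligned}
```

The difference and its standard error are therefore both multiplied by the same factor, $N_2/(N_1 + N_2)$, and their ratio is unchanged.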

6. The Difference between Two Proportions. Although we
have so far dealt only with the differences between means of
samples, the same types of formula hold for other statistics.
For example, if we are testing the difference between two
proportions, the formulas corresponding to formulas (139), (140),
(141), and (142) are, in order:

Two simple samples from the same universe,

$$\epsilon_D^2 = pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right), \qquad (143)$$

where

$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}. \qquad (144)$$

Two simple samples from different universes,

$$\epsilon_D^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}. \qquad (145)$$

Two simple samples from the same universe, the proportion of
successes in the first tested against the mean proportion of
successes in the two combined,

$$\epsilon_D^2 = \frac{pq\,n_2}{(n_1 + n_2)\,n_1}. \qquad (146)$$

Can samples 1 and 2 of Table 73, below, be simple samples
from the same universe, in the proportion of families having
seven or more members? Applying formula (143), we need the
value of p from formula (144):

$$p = \frac{(132)\left(\frac{14}{132}\right) + (134)\left(\frac{9}{134}\right)}{132 + 134} = .0865.$$

Whence

$$q = 1.0000 - .0865 = .9135,$$
$$\epsilon_D^2 = (.0865)(.9135)\left(\frac{1}{132} + \frac{1}{134}\right) = .001188,$$
$$\epsilon_D = 0.0345,$$
$$\text{C.R.} = \frac{\frac{14}{132} - \frac{9}{134}}{.0345} = 1.13.$$


Table 73. Two Samples of Families, Classified by Number of Members

Members in family    Sample 1 (f₁)    Sample 2 (f₂)    Total
 1- 2                     56               66           122
 3- 4                     40               42            82
 5- 6                     22               17            39
 7- 8                      6                5            11
 9-10                      4                1             5
11-12                      2                2             4
13-14                      0                1             1
15-16                      2                0             2
Total                    132              134           266

(The last five rows, 7-8 through 15-16, are bracketed together in the total column.)
From Appendix Table 1, we find a probability of about 26 in 100
that a difference greater than that observed between the
proportions in the two samples might occur by sampling error, under
the conditions assumed. If now we use the alternative formula
(146), we get

$$\epsilon_D^2 = \frac{(.0865)(.9135)(134)}{(132 + 134)\,132} = .000302,$$
$$\epsilon_D = 0.0174,$$
$$\text{C.R.} = \frac{\frac{14}{132} - \frac{23}{266}}{.0174} = 1.13.$$

Notice again that this is the same critical ratio as that obtained
just above by the use of formula (143).
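
Both routes to the critical ratio for two proportions can be sketched in a few lines. The function names below are illustrative assumptions; the counts (14 of 132 and 9 of 134 families with seven or more members) come from Table 73.

```python
import math

def pooled_proportion(x1, n1, x2, n2):
    """Formula (144): proportion of successes in the two samples combined."""
    return (x1 + x2) / (n1 + n2)

def cr_between_samples(x1, n1, x2, n2):
    """Formula (143): sample 1 tested against sample 2."""
    p = pooled_proportion(x1, n1, x2, n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def cr_against_combined(x1, n1, x2, n2):
    """Formula (146): sample 1 tested against the combined proportion."""
    p = pooled_proportion(x1, n1, x2, n2)
    se = math.sqrt(p * (1 - p) * n2 / ((n1 + n2) * n1))
    return (x1 / n1 - p) / se

print(round(cr_between_samples(14, 132, 9, 134), 2))
print(round(cr_against_combined(14, 132, 9, 134), 2))
```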

7. The Difference between Two Correlation Coefficients. To
test the significance of the difference between two correlation
coefficients from simple samples,² the variates being normally
distributed and independent, it is necessary to convert $r_1$ and $r_2$
to $z_1$ and $z_2$, respectively. This is readily done by means of
Appendix Table 5. The standard error of z is then found from
the formula

$$\epsilon_z = \frac{1}{\sqrt{N - 3}}. \qquad (147)$$

Suppose for the correlation between the linguistic ability and
leadership scores of a group of children, we find $r_1 = .50$ from
sample 1, and $r_2 = .60$ from sample 2, where $N_1 = N_2 = 50$.
Is the difference between the two r's significant, or is it merely
an accident of sampling? From Appendix Table 5, we find for
$r_1 = .50$, $z_1 = .549$, and for $r_2 = .60$, $z_2 = .693$. By formula
(147) we calculate the standard error of z,

$$\epsilon_z = \frac{1}{\sqrt{50 - 3}} = 0.146.$$

Hence the standard error of the difference $z_2 - z_1$ is, by formula
(137),

$$\epsilon_D^2 = (.146)^2 + (.146)^2 = 0.0426.$$

So that

$$\text{C.R.} = \frac{.693 - .549}{.206} = 0.699.$$

Since this critical ratio is well under two standard errors, we
infer that the difference between the two r's is not significant.

² Of any size.
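
Where Appendix Table 5 is not at hand, the conversion from r to z can be computed directly, since the z transformation is the inverse hyperbolic tangent of r. The sketch below is illustrative only (the function name is an assumption); carrying full decimals it gives a critical ratio of about 0.70 rather than the 0.699 obtained from the rounded table values.

```python
import math

def cr_two_correlations(r1, n1, r2, n2):
    """Critical ratio for the difference between two independent r's."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # r converted to z
    se_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3)) # formulas (147) and (137)
    return (z2 - z1) / se_diff

print(round(math.atanh(0.50), 3), round(math.atanh(0.60), 3))
print(round(cr_two_correlations(0.50, 50, 0.60, 50), 2))
```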

8. Testing the Significance of a Sum. — The basic formulas 
(136) and (137) are the same for the sum as for the difference 
between two statistics, except that for the sum all the signs 
of formula (136) are positive. 

9. Testing the Hypothesis of Simple Sampling. Suppose we
ask if Table 72 above can be a simple sample from a universe
of communities in which the mean score is 40. We have
$\epsilon_M = \sigma/\sqrt{N} = 23/\sqrt{75} = 2.66$,* so that

C.R. = (48.4 - 40.0)/2.66 = 3.16.

Since the critical ratio is greater than two standard errors, it is
not likely that the sample is a simple sample from a universe of
communities whose mean score is 40. There are several possible
explanations: (1) Table 72 may be a simple sample with an
extreme mean that might rarely be drawn by chance from the
given universe; (2) it may be a sample from the given universe,
but not taken as a simple sample should have been; or (3) it
may not be a sample from the given universe at all. There is no
way to determine which of these possibilities is correct, unless
it can be learned how the sample was actually taken.

The purpose of testing the difference between the two
means in Sec. 5, above, might have been to discover whether
or not they could occur in two simple samples from the same
universe. The very low critical ratio of 0.0589 suggests an
affirmative answer, but it cannot completely establish the fact.
For example, the low critical ratio might be due to the small
size of the samples or to the presence of correlation, or it might
be an accident not connected with random sampling.

The same test can, of course, be employed to determine
whether a sample might be random or Poisson, by merely using
the standard error formula appropriate in each case.

10. The Significance of the Difference between Two or More
Frequency Distributions. A more complete test of the hypothesis
that two or more samples are simple samples from the same
universe is possible by the Chi-square method, which goes
beyond the comparison of single statistics (e.g., means) to the
comparison of whole distributions. In Table 73 we have two
samples of families distributed according to number of members.
We can use either formula (148) or formula (149) to find Chi
square ($\chi^2$). Formula (148) is applicable only to the case of
two samples, when the total for each row is not wanted, but is
quicker than formula (149). We shall apply it here.

* $\sigma$ found from formula (118), Chap. XII.


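$$\chi^2 = \frac{1}{pq}\left[\Sigma\left({}_r f_1 \cdot {}_r p_1\right) - n_1 p\right], \qquad (148)$$

$$\chi^2 = \Sigma\left[\frac{(f_o - f_t)^2}{f_t}\right], \qquad (149)$$

where

$$p = \frac{n_1}{N}, \qquad q = 1 - p, \qquad {}_r p_1 = \frac{{}_r f_1}{n_r};$$

${}_r f_1$ is the frequency in row r of col. 1, $n_c$ is the total of any column
c, $n_1$ is the total of col. 1, $n_r$ is the total of any row r, $f_o$ is any
observed frequency, $f_t$ is any theoretical or expected frequency,
and N is the total of the whole table. For Table 73 we have

$$p = \tfrac{132}{266} = 0.496,$$
$$q = 1 - 0.496 = 0.504,$$
$${}_1 p_1 = \tfrac{56}{122} = 0.459,$$
$${}_2 p_1 = \tfrac{40}{82} = 0.488,$$
$${}_3 p_1 = \tfrac{22}{39} = 0.564,$$
$${}_4 p_1 = \tfrac{14}{23} = 0.609.$$

To get ${}_4 p_1$, the frequencies of the last five rows were combined, in
accordance with the rule that no cell should contain less than
five expected frequencies. Substituting in formula (148),

$$\chi^2 = \frac{1}{.496(.504)}\left[56(.459) + 40(.488) + 22(.564) + 14(.609) - 132(.496)\right],$$
$$\chi^2 = 2.76.$$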

Entering a table of Chi square (Appendix Table 2) with

r - 1 = 4 - 1 = 3 degrees of freedom

(i.e., one less than the number of rows, counting the five
combined rows as one), we find a probability P between .30 and .50
that the differences between the two samples might be due to
errors of simple sampling from the same universe. The test
therefore fails to show that the two sample distributions of
families differ significantly in number of members.
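
As a check on the two-sample short cut, the sketch below computes Chi square for Table 73 both by the short formula (148) and by the general formula (149). The function names are assumptions made for illustration. Without the three-decimal rounding used in the hand calculation, both formulas give about 2.58 rather than 2.76; the probability still lies between .30 and .50, so the verdict is unchanged.

```python
def chi_square_general(table):
    """Formula (149): sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            total += (observed - expected) ** 2 / expected
    return total

def chi_square_two_samples(table):
    """Formula (148): short form for exactly two samples (columns)."""
    n1 = sum(row[0] for row in table)
    n = sum(sum(row) for row in table)
    p = n1 / n
    q = 1 - p
    # Each term is (frequency in col. 1) x (its proportion of the row total).
    weighted = sum(row[0] * row[0] / sum(row) for row in table)
    return (weighted - n1 * p) / (p * q)

# Table 73 with the last five rows combined into one.
table_73 = [[56, 66], [40, 42], [22, 17], [14, 9]]
print(round(chi_square_two_samples(table_73), 2))
print(round(chi_square_general(table_73), 2))
```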

The Chi-square test may also be used to investigate whether
a sample is a simple sample from a known universe, if the
distribution of the universe is known. The universe distribution,
with N equated to that of the sample, simply takes the place
of one of the samples in Table 73.

To test whether more than two samples are from the same
universe, it is necessary to find Chi square by formula (149).
Its application to three samples in Table 74 is shown below.


Table 74. Three Samples of Families, Classified by Number of Members

Members            Sample 1                  Sample 2                  Sample 3
in family     f_o    f_t   (f_o-f_t)²/f_t  f_o    f_t   (f_o-f_t)²/f_t  f_o    f_t   (f_o-f_t)²/f_t   Total
 1- 2          56   56.34     .00205        66   57.20    1.35385        53   61.46    1.16452         175
 3- 4          40   40.57     .00801        42   41.18     .01633        44   44.25     .00141         126
 5- 6          22   21.57     .00857        17   21.90    1.09635        28   23.53     .84917          67
 7- 8           6                            5                            9                             20
 9-10           4                            1                            3                              8
11-12           2                            2                            5                              9
13-14           0                            1                            1                              2
15-16           2                            0                            1                              3
 7-16 combined 14   13.52     .01704         9   13.73    1.62949        19   14.75    1.22457          42
Total         132  132.01    0.03567       134  134.00    4.09602       144  143.99    3.23967         410

The expected frequency, $f_t$, in any cell is found, as explained in
Chap. IX, by dividing the table total into the product of the
row and column totals. For example, the expected frequency
in the class interval 3-4, sample 2, is 126(134)/410 = 41.18;
in the class interval 5-6, sample 3, it is 67(144)/410 = 23.53;
and so on. The last five rows are combined, because four of
them have fewer than five expected frequencies. After
combining, the expected frequency for sample 1 is (132)(42)/410 = 13.52. By
formula (149),

$$\chi^2 = \Sigma\left[\frac{(f_o - f_t)^2}{f_t}\right] = 7.37136.$$
The degrees of freedom are

(c - 1)(r - 1) = (3 - 1)(4 - 1) = 6,

where c is the number of columns, and r is the number of rows,
counting all combined rows as one. Entering a table of $\chi^2$ with
six degrees of freedom, we see that $\chi^2 = 12.59$ for P = .05.
That is, a value of $\chi^2$ as large as 12.59 may be expected by chance
five times in 100. Smaller values of $\chi^2$ will, of course, occur
more often by chance. Since our value of $\chi^2$ is only 7.37136, it
cannot be regarded as significant. The test therefore furnishes
no evidence that our three samples are not simple samples from
the same universe.
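
The whole of the Table 74 calculation, expected frequencies included, can be sketched as follows. The function name and return layout are assumptions for illustration; the observed frequencies are those of Table 74 with the last five rows combined, and 12.59 is the tabled value of Chi square for six degrees of freedom at P = .05 quoted above.

```python
def chi_square_table(table):
    """Return (chi_square, degrees_of_freedom, expected) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency = row total x column total / table total.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi_square = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi_square, dof, expected

table_74 = [[56, 66, 53], [40, 42, 44], [22, 17, 28], [14, 9, 19]]
chi_square, dof, expected = chi_square_table(table_74)
print(round(chi_square, 2), dof)
print("significant at P = .05:", chi_square > 12.59)
```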

11. The Significance of the Difference between Statistics
from More than Two Samples. In testing the significance of
the differences between statistics (e.g., means) from three or more
samples, the probability of finding a significant difference by
chance is greater than in the case of only one difference, just as
the probability of getting an ace at cards is greater when we draw
twice from the deck than when we draw only once.

A formula that takes this into account is formula (150), in which
d is the difference between any two independent statistics,
$\epsilon_D$ is its standard error, and n is the number of differences. In
Table 75 we have six simple samples of juvenile delinquents and
six simple samples of nondelinquents, each containing 50 boys.
Using formula (143) to find the standard error of the six
differences, we have, for the first pair of samples in the table,

$$\epsilon_D^2 = (.05)(.95)\left(\frac{1}{50} + \frac{1}{50}\right) = .0019, \qquad \epsilon_D = .044, \text{ or 4.4 per cent.}$$

The standard errors of the other differences are found similarly,
and entered in Table 75. Substituting the total of the last column
of the table, -0.68, in formula (150),

C.R. = -0.25.

Table 75. Six Samples of 50 Juvenile Delinquents Each, and Six
Control Samples, Showing Percentages Neurotic

Samples      Percentage      Percentage        d      ε_D     d/ε_D
             delinquents     nondelinquents
             neurotic        neurotic
1 and 1a          4               6            -2      4.4    -0.45
2 and 2a         10               4             6      5.1     1.18
3 and 3a          2               4            -2      3.4    -0.59
4 and 4a          0               2            -2      2.0    -1.00
5 and 5a          4               8            -4      4.7    -0.85
6 and 6a          6               2             4      3.9     1.03
Total                                                         -0.68

From a table of normal areas (Appendix Table 1), it appears
that a positive or negative critical ratio greater than this might
occur by chance over 80 times in 100 trials, so that there is no
evidence that the samples of delinquents differ significantly
from the samples of nondelinquents in respect to the percentages
neurotic.
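
The standard errors and ratios entered in Table 75 can be rebuilt from the percentages alone. The sketch below is illustrative and uses assumed names; it applies formula (143) to each pair of samples of 50 and, following the table, divides each difference by its standard error rounded to one decimal. It stops at the column total and does not attempt the final combination by formula (150).

```python
import math

def se_difference_percent(pct1, pct2, n1, n2):
    """Formula (143), with percentages in and percentage points out."""
    p = (n1 * pct1 + n2 * pct2) / (n1 + n2) / 100   # pooled proportion
    return 100 * math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# (percentage of delinquents neurotic, percentage of nondelinquents neurotic)
pairs = [(4, 6), (10, 4), (2, 4), (0, 2), (4, 8), (6, 2)]

standard_errors = [round(se_difference_percent(a, b, 50, 50), 1) for a, b in pairs]
ratios = [round((a - b) / se, 2) for (a, b), se in zip(pairs, standard_errors)]

print(standard_errors)
print(ratios, round(sum(ratios), 2))
```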

Formula (150) is applicable to any set of independent critical
ratios, including those from random or Poisson samples, if
random or Poisson standard error formulas are used to find the
values of $\epsilon_D$.


Exercises 

1. Correlate the birth rates of Table 70 of Chap. XII with the
fertility ratios of the same sample counties in the table of Exercise 11 in
Chap. XII, and test whether the correlation coefficient is significantly
greater than zero.

2. In the table of Exercise 3, below, test the hypothesis that rural
farm and rural nonfarm families (1) are from the same universe in
respect to mean size of family; (2) are from different universes, but their
means might sometimes be approximately equal as a result of sampling
error.

3. In the table below test the assumption that urban families might
be a random sample from a universe which is best represented by the
three samples combined. Use the mean as the criterion.
Sample of 2,992 Families by Size, for Urban and Rural Areas, United
States, 1930

Members in        Urban         Rural farm     Rural nonfarm     Total
family            families      families       families
 1                    140            34              62              236
 2                    436           121             141              698
 3                    384           119             120              623
 4                    315           110              99              524
 5                    202            88              68              358
 6                    118            66              43              227
 7                     66            47              27              140
 8                     37            32              16               85
 9                     20            20               9               49
10                     10            12               5               27
11                      5             6               2               13
12 or more*             4             6               2               12
Sample total        1,737           661             594            2,992
Universe total 17,400,000     6,600,000       5,900,000       29,900,000

* Count as 13.


4. Do the matched delinquents and nondelinquents in the sample
below differ significantly in mean I.Q.?

Intelligence Quotients of a Random Sample of 25 Male Juvenile
Delinquents and 25 Male Nondelinquents, Matched by Age,
Family Income, and Place of Residence

                  Intelligence quotient
Pair number     Delinquent     Nondelinquent
 1                 103              99
 2                  80              92
 3                 114             106
 4                 100             104
 5                  91              88
 6                  73              80
 7                 105             109
 8                  98              94
 9                  86              90
10                 101              97
11                  92              89
12                  86              91
13                  93              90
14                  90              97
15                  79              84
16                 108              96
17                  82              91
18                  95              86
19                  74              83
20                 102              97
21                 105              99
22                  97             103
23                  88              91
24                  94              84
25                  99             106
Total            2,335           2,346


5. Combining rural farm and rural nonfarm families in the table of
Exercise 3, above, is there a significant difference between the mean
size of family in urban and rural areas according to this sample taken
proportionally at random from the two types of areas?

6. Apply Chi-square to the table of Exercise 3 above to test (1) the
hypothesis that rural farm and rural nonfarm families are simple samples
from the same universe; (2) the hypothesis that urban, rural farm, and
rural nonfarm families are simple samples from the same universe.

7. In the table of Exercise 3 above, test the hypothesis that the means
of the three simple samples are from the same universe.

8. Test the hypothesis that families of odd sizes and families of even
sizes in the table of Exercise 3 above are Poisson samples from the
same stratified universe.

9. Test the hypothesis that the value of a selected statistic from each
of the samples drawn in Exercise 4 of Chap. XII does not differ
significantly from the known value of the corresponding parameter in the
universe.

10. Test the hypothesis that the value of a selected statistic from the
sample drawn in Exercise 5 of Chap. XII does not differ significantly
from the known value of the corresponding parameter in the universe.

References 

Same as for Chap. XII. 



CHAPTER XIV 
TIME SERIES ANALYSIS 


Values of a variable (e.g., infant death rates) given at successive
intervals of time (e.g., yearly) form a time series. Such series
are especially important in economics and are also necessary in
the study of vital statistics,¹ of trends in public expenditures for
relief, and many other topics. This chapter describes methods
for their analysis.

As an illustrative problem, let us inquire what the state of
Wisconsin has accomplished in reducing infant mortality.
Figures giving deaths per 1,000 live births from 1908 through
1935 are shown as a time series in Table 76.


Table 76. Wisconsin Infant Mortality Rates, 1908-1935*

          Infant deaths                 Infant deaths
Year      per 1,000 live      Year      per 1,000 live
          births                        births
1908          107             1922           70
1909          120             1923           70
1910          109             1924           64
1911          103             1925           67
1912           95             1926           67
1913           97             1927           60
1914           83             1928           61
1915           78             1929           60
1916           86             1930           56
1917           78             1931           53
1918           79             1932           50
1919           79             1933           48
1920           77             1934           49
1921           72             1935           46

* Report of the Wisconsin Bureau of Vital Statistics, 1934-1935, p. 284.


The first step that is usually taken in time series analysis is to
plot the data. This is done for Table 76 in the lower graph of
Fig. 55. Examination of this figure shows a striking decline in
infant mortality in Wisconsin over a 28-year period.

¹ Birth rates, death rates, marriage rates, and so on.

Fig. 55. Infant mortality rates for Wisconsin and for the United States, 1908-
1935. (From Tables 76, 77, and 80.)


1. The Secular Trend: A Straight Line. Suppose, further,
that we want to compare the infant mortality record in Wisconsin
with that of other states in the United States. Data for the
original birth registration area of 10 states and the District of
Columbia¹ are available for the period 1915 through 1933, and
are entered in Table 77. They are plotted as a dotted line in
Fig. 55. It is seen, from this figure, that infant mortality has
been less in Wisconsin than in the original registration area
throughout the entire period of comparison. But has the rate
of decline since 1915 been greater in Wisconsin or in the
original registration area?

¹ Connecticut, Maine, Massachusetts, Michigan, New Hampshire, New
York, Pennsylvania, Rhode Island, Vermont, and the District of Columbia.


Table 77. Infant Mortality in the Original Birth Registration
Area of the United States, 1915-1933*

          Infant deaths                 Infant deaths
Year      per 1,000 live      Year      per 1,000 live
          births                        births
1915          100             1925           74
1916          100             1926           75
1917           96             1927           64
1918          106             1928           67
1919           89             1929           65
1920           90             1930           62
1921           79             1931           60
1922           79             1932           55
1923           79             1933           53
1924           72

* From Birth, Stillbirth, and Infant Mortality Statistics, 1933, p. 7, U. S. Bureau of the
Census.


The answer is to determine which of the two series has the steeper
slope. Inspection shows that both graphs are irregular and
saw-toothed in shape, so that the slope sometimes of one and
sometimes of the other is the steeper. What we must do is to
remove the irregularities in the two series, i.e., reduce them to
smooth curves. To do this is to find the secular trend, meaning
the general direction of the series over a considerable period of
time, freed from confusing oscillations. To answer our
particular question in the present case, it seems appropriate to fit
straight lines to the data, since more complex smooth curves do
not describe the declining death rates any better.

We have already learned to fit a straight line by the device
of least squares, in determining the regression equation in linear
correlation. The normal equations for finding the values of a
and b in the line of best fit are

$$b = \frac{\Sigma xY}{\Sigma x^2}, \qquad (151)$$

$$a = M_Y, \qquad (152)$$

where x is a deviate from the "midyear" of an odd¹ number of years,
Y is the infant death rate, and the origin is at the midyear or
mean of the X's. The values of a and b so found are substituted
in the equation of the straight line,

$$Y_c = a + bx. \qquad (153)$$

We set up Table 78 to obtain these values for the Wisconsin
series, and Table 79 for the registration area series.


Table 78. Fitting a Straight Line to the Wisconsin Data of Table 76

Year     Year (x₁)    Infant death rate (Y)     x₁Y      x₁²
1915        -9                78               -702       81
1916        -8                86               -688       64
1917        -7                78               -546       49
1918        -6                79               -474       36
1919        -5                79               -395       25
1920        -4                77               -308       16
1921        -3                72               -216        9
1922        -2                70               -140        4
1923        -1                70                -70        1
1924         0                64                  0        0
1925         1                67                 67        1
1926         2                67                134        4
1927         3                60                180        9
1928         4                61                244       16
1929         5                60                300       25
1930         6                56                336       36
1931         7                53                371       49
1932         8                50                400       64
1933         9                48                432       81

                      ΣY = 1,275          Σx₁Y = -1,075   Σx₁² = 570
                      M_Y = 67.1

From Table 78,

$$b_1 = \frac{-1{,}075}{570} = -1.886,$$

and

$$a_1 = 67.1,$$

so that, approximately,

$$Y_1 = 67 - 1.89x_1 \qquad (153a)$$

is the equation of the straight-line secular trend fitted to infant
mortality rates in Wisconsin with origin at 1924.

Similarly, from Table 79, we find the equation of the straight-
line trend through infant mortality rates in the original
registration area of the United States to be approximately

$$Y_2 = 77 - 2.75x_2. \qquad (153b)$$

¹ If the series includes an even number of years, one of them may be
dropped to give an odd number, so that the convenient short formulas (151)
and (152) may be used, instead of the more laborious normal equations for a
straight line given in Chap. X.


Table 79. Fitting a Straight Line to the Original Registration
Area Data of Table 77

Year     Year (x₂)    Infant death rate (Y)     x₂Y      x₂²
1915        -9               100               -900      (see Table 78)
1916        -8               100               -800
1917        -7                96               -672
1918        -6               106               -636
1919        -5                89               -445
1920        -4                90               -360
1921        -3                79               -237
1922        -2                79               -158
1923        -1                79                -79
1924         0                72                  0
1925         1                74                 74
1926         2                75                150
1927         3                64                192
1928         4                67                268
1929         5                65                325
1930         6                62                372
1931         7                60                420
1932         8                55                440
1933         9                53                477

                      ΣY = 1,465          Σx₂Y = -1,569   Σx₂² = 570
                      M_Y = 77.1


We are now in a position to answer the question, Does the
trend line for Wisconsin or that for the original registration area
have the steeper slope? We see that the slope of the Wisconsin
line is $b_1 = -1.89$, whereas the slope of the original registration
area line is $b_2 = -2.75$. The negative signs mean that as x
increases, i.e., as time passes, Y, the infant death rate, decreases.
Evidently, the trend of infant mortality has been decreasing
2.75/1.89 = 1.5 times as fast in the original registration area as in
Wisconsin. The two lines of trend are plotted in Fig. 55 by
substituting appropriate values of x in equations (153a) and
(153b). For example, the ordinate of the line through the
Wisconsin data is, if $x_1 = -9$, $Y_1 = 67 - 1.89(-9) = 84$; and
if $x_1 = 9$, $Y_1 = 67 - 1.89(9) = 50$; so that the line is drawn
through the points (-9, 84) and (9, 50).
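
The fitting of both trend lines by the short formulas (151) and (152) can be sketched in a few lines. The function name is an assumption for illustration; the rates are those of Tables 78 and 79 for 1915 through 1933, with the origin at the midyear, 1924.

```python
def fit_trend(rates):
    """Least-squares line Y = a + b*x with x measured from the midyear.

    Uses the short formulas b = sum(xY) / sum(x^2) and a = mean of Y,
    which hold when the number of years is odd and x sums to zero.
    """
    n = len(rates)
    if n % 2 == 0:
        raise ValueError("use an odd number of years so that x sums to zero")
    mid = n // 2
    x = [i - mid for i in range(n)]
    b = sum(xi * y for xi, y in zip(x, rates)) / sum(xi * xi for xi in x)
    a = sum(rates) / n
    return a, b

wisconsin = [78, 86, 78, 79, 79, 77, 72, 70, 70, 64,
             67, 67, 60, 61, 60, 56, 53, 50, 48]
registration_area = [100, 100, 96, 106, 89, 90, 79, 79, 79, 72,
                     74, 75, 64, 67, 65, 62, 60, 55, 53]

a1, b1 = fit_trend(wisconsin)
a2, b2 = fit_trend(registration_area)
print(round(a1, 1), round(b1, 3))
print(round(a2, 1), round(b2, 3))
print("ratio of slopes:", round(b2 / b1, 1))
```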

In terms of percentages, the infant mortality rate declined an
average of 2.13 per cent per year in Wisconsin, as compared with
2.56 per cent in the total registration area.

2. The Secular Trend: A Moving Average. It is an important
principle that any line or curve used to represent the secular
trend of a series should be rather simple in form: a straight
line if that is at all reasonable, otherwise seldom anything more
complex than a second degree parabola ($Y = a + bX + cX^2$).
The reasons are that a trend line that follows the original data
too closely includes cyclical variations from which the secular
trend should be freed, and it also fails to fulfill the primary
purpose of a trend line, which is to show clearly the general
direction, up or down, in which the series is moving.

Of course, a straight line may be a very poor fit for some
series, so that if we want to generalize the trend without doing
too much violence to the data we may need to fit another kind of
curve, say, a parabola. Although the formulas differ, the general
principles are the same.

A second method of determining secular trend, which usually
allows the trend line to follow the original data more closely
than a straight line does, should be explained. This is the
method of the moving average, which is shown in Table 80.
It is again preferable to average an odd number of years, because
the results can then be more conveniently centered at a given
year in the series. If cycles appear in the original series, the
length of the moving average should be equal to the average
period of a cycle from peak to peak, or some multiple thereof, if
the purpose is to represent the secular trend. But if the moving
average is used only to smooth out random fluctuations, its
length should be less than that of an average cycle period.
The shorter the period of the moving average, the more flexible
is the resulting curve. Inspection of Fig. 55 suggests the
presence of possible cycles of about seven years in length in both
series. Accordingly, moving averages of seven years are shown
in Table 80.


Table 80. Seven-year Moving Averages of Infant Mortality Rates
in Wisconsin and in the Original Registration Area of the
United States, 1915-1933

            Mortality rates             Seven-year moving averages
Year     Wisconsin   Registration      Wisconsin   Registration
                     area                          area
1915        78          100
1916        86          100
1917        78           96
1918        79          106               78            94
1919        79           89               77            91
1920        77           90               75            88
1921        72           79               73            85
1922        70           79               71            80
1923        70           79               70            78
1924        64           72               67            75
1925        67           74               66            73
1926        67           75               64            71
1927        60           64               62            68
1928        61           67               61            67
1929        60           65               58            64
1930        56           62               55            61
1931        53           60
1932        50           55
1933        48           53

The method is simply to add the first seven values of the series,
and divide by 7. Thus, for the Wisconsin series, we have
(78 + 86 + 78 + 79 + 79 + 77 + 72)/7 = 549/7 = 78.4. Then
the first value in the table, 78, is dropped, and the eighth value,
70, is added, and again the sum is divided by 7:

(549 - 78 + 70)/7 = 541/7 = 77.3;

and so on.

Notice that a disadvantage of the moving average is that it
reduces the length of the series by one less than the number of
years averaged, or in this case 7 - 1 = 6 years. When the
moving averages are plotted as large dots in Fig. 55, it is
seen that they give trend lines that agree very closely with the
straight lines of best fit, especially in the case of the Wisconsin
data.
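
The drop-one, add-one procedure can be written compactly. This sketch uses an assumed function name and a simple window sum rather than the running total described above; the two give the same averages. Each average belongs to the middle year of its window, so the first value is centered at 1918.

```python
def moving_average(values, window):
    """Averages of each consecutive run of `window` values.

    The result is shorter than the input by window - 1 items.
    """
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

wisconsin = [78, 86, 78, 79, 79, 77, 72, 70, 70, 64,
             67, 67, 60, 61, 60, 56, 53, 50, 48]   # 1915 through 1933

averages = moving_average(wisconsin, 7)
for year, value in zip(range(1918, 1931), averages):
    print(year, round(value, 1))
```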

It is helpful in selecting a secular trend to note that "if the
actual data fall consistently above or below a line of trend for a
considerable period, it is probable that the fit is not good."¹
This is not the case in Fig. 55.

¹ F. C. Mills, Statistical Methods, p. 290, Henry Holt and Company, Inc.,
New York, 1924.

3. Short-term Cycles. The cycles in the Wisconsin and
registration area series may be shown more clearly than in Fig. 55
by expressing the original rates as percentages of the trend, using
for the latter either the values lying on the straight lines of best
fit or the moving averages just found. If we choose the former,
the results are shown in the last two columns of Table 81. Thus,
from Table 80, for 1915, we have 78, and from Table 81, 84.01, so
that 100(78/84.01) = 92.85. Any cyclical tendencies in these
percentages of trend will stand out even more if we subtract
100 per cent from each of them, thus expressing them as positive
and minus deviations. This is done in Table 82,¹ and the
resulting cyclical deviations are plotted in Fig. 56.

Fig. 56. Infant mortality rates in Wisconsin and the original registration
area of the United States, 1915-1933: cyclical deviations from linear trends.
(From Table 82.)
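
The percentage-of-trend calculation can be sketched as follows, using the rounded Wisconsin trend equation $Y_1 = 67 - 1.89x_1$ with origin at 1924. The function name is an assumption for illustration.

```python
def percent_of_trend(rates, a, b):
    """Observed rates as percentages of the straight-line trend a + b*x.

    x is measured from the middle year of the series.
    """
    mid = len(rates) // 2
    return [100 * y / (a + b * (i - mid)) for i, y in enumerate(rates)]

wisconsin = [78, 86, 78, 79, 79, 77, 72, 70, 70, 64,
             67, 67, 60, 61, 60, 56, 53, 50, 48]   # 1915 through 1933

percentages = percent_of_trend(wisconsin, a=67, b=-1.89)
deviations = [p - 100 for p in percentages]        # cyclical deviations
for year, p, d in zip(range(1915, 1934), percentages, deviations):
    print(year, round(p, 2), round(d, 2))
```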

From Fig. 56 it appears that only short and erratic cycles
occur in infant mortality rates in Wisconsin and in the original
registration area over the period 1915 through 1933. Slightly
different results would have been obtained if the moving average
instead of the straight line had been used as the index of trend.

Table 81. Infant Mortality Rates in Wisconsin and in the Original
Registration Area of the United States, 1915-1933
Straight-line Trend Values and Observed Values as Percentages of the
Trend Values

           Linear trend values          Observed rates as
                                        per cent of trend
Year     Wisconsin   Registration     Wisconsin   Registration
                     area                         area
1915       84.01       101.75           92.85        98.28
1916       82.12        99.00          104.72       101.01
1917       80.23        96.25           97.22        99.74
1918       78.34        93.50          100.84       113.37
1919       76.45        90.75          103.34        98.07
1920       74.56        88.00          103.27       102.27
1921       72.67        85.25           99.08        92.67
1922       70.78        82.50           98.90        95.76
1923       68.89        79.75          101.61        99.06
1924       67.00        77.00           95.52        93.51
1925       65.11        74.25          102.90        99.66
1926       63.22        71.50          105.98       104.90
1927       61.33        68.75           97.83        93.09
1928       59.44        66.00          102.62       101.52
1929       57.55        63.25          104.26       102.77
1930       55.66        60.50          100.61       102.48
1931       53.77        57.75           98.57       103.90
1932       51.88        55.00           96.38       100.00
1933       49.99        52.25           96.02       101.44

¹ Notice that the first two columns of Table 82 should each sum to zero.
They fail to do so because we disregarded decimals in the equations of the
lines of best fit.
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Table 82 — Infant Mortality Rates in Wisconsin and the Original 
Registration Area of the United States, 1915-1933 
Percentage Deviations from Straight-line Trends 


Year 

Percentage 

from 

Wisconsia 

(*') 

deviations 

trend 

Registra- i 
tion area 
(2/0 



(®V) 

1915 

-7 

15 



1 

72 

51 

12 

2 

96 

12 

30 

1916 

+4 

72 

+ 

1 

01 

22 

28 

1 

02 

4 

77 

1917 

-2 

78 

_ 

0 

26 

7 

73 


07 

0 

72 

1918 

+0 

84 

+13 

37 , 


71 

178 

76 

11 

23 

1919 

+3 

34 

— 

1 

93 

11 

16 

3 

72 

- 6 

45 

1920 

+3 

27 

+ 

2 

27 

10 

69 

5 

15 

7 

42 

1921 

-0 

92 

— 

7 

33 


85 

53 

73 

6 

74 

1922 

-1 

10 

— 

4 

24 

1 

21 

17 

98 

4 

66 

^ 1923 

+1 

61 

— 

0 

94 

2 

59 


88 

- 1 

51 

1924 

-4 

48 

— 

6 

49 

20 

07 

42 

12 

29 

08 

1925 

+2 

90 


0 

34 

8 

41 


,12 

1 

- 0 

99 

1926 

+5 

98 

+ 

4 

90 

35 

76 

24 

01 

29 

30 

1927 

-2 

17 

— 

6 

91 

4 

71 

47 

75 

14 

99 

1928 

+2 

62 

+ 

1 

52 

6 

86 

2 

31 

3 

98 

1929 


26 

+ 

2 

77 

18 

15 

7 

67 

11 

80 

1930 

+0 

61 

+ 

2 

48 


37 

1 6 

15 

1 

51 

1931 

-1 

43 

+ 

3 

90 

2 

04 

15 

21 

- 5 

58 

1932 

-3 

62 


0 

00 

13 

10 

0 

00 

0 

00 

1933 

-3 

98 

+ 

1 

44 

15 

84 

2 

07 

- 5 

73 

Total 

+2 52 

j + 3 50 

233, 

65 

411 

68 

118 26 


To compare the amounts of fluctuation of the two series
around the line of trend, the percentage deviations of Table 82
are squared and summed, giving for the Wisconsin series,

$$\sigma_{x'} = \sqrt{\frac{\Sigma x'^2}{N}} = \sqrt{\frac{233.65}{19}} = 3.51,$$

and for the original registration area,

$$\sigma_{y'} = \sqrt{\frac{\Sigma y'^2}{N}} = \sqrt{\frac{411.68}{19}} = 4.65.$$

We therefore conclude that the original registration area series is
1.32 times as variable as the Wisconsin series. Some of this
difference is due to the abnormal rates of the war year 1918. 
We would expect such a result, as conditions affecting infant 
health are probably more variable over the whole registration 
area than in the single state of Wisconsin. 
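The comparison of variability can be checked with a short Python sketch. It takes the sums of squared deviations from the Total row of Table 82 and measures the spread around the trend line (that is, around zero), as the text does; the helper name `rms_deviation` is an assumption for illustration.

```python
import math


def rms_deviation(sum_of_squares, n):
    """Root-mean-square deviation around the trend line (around zero)."""
    return math.sqrt(sum_of_squares / n)


N = 19                                        # years 1915 through 1933
sigma_wisconsin = rms_deviation(233.65, N)    # sum of x'^2, Table 82
sigma_registration = rms_deviation(411.68, N) # sum of y'^2, Table 82
ratio = sigma_registration / sigma_wisconsin

print(round(sigma_wisconsin, 2), round(sigma_registration, 2), round(ratio, 2))
```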

4. Correlation between the Short-term Cycles of Two Time
Series. — Inspection of Fig. 56 shows that infant mortality rates
tend to rise and fall together in Wisconsin and in the original
registration area. This resemblance between the apparently
erratic fluctuations of the two series may be symptomatic of the
existence of general factors that produce cycles in infant deaths.
The point is important enough to test with some care. We may
ask, just how much relationship is there between the variations
in infant mortality rates in Wisconsin and in the original registration
area? To answer this question we need to know the value
of the coefficient of correlation between the two time series,
taking the deviations from the trend lines, as given in Table 82,
instead of from the means of the series. It will be recalled
that the formula for the Pearsonian coefficient of correlation is

$$r = \frac{\Sigma x'y' - N M_{x'} M_{y'}}{N \sigma_{x'} \sigma_{y'}}.$$


Taking the sum of the cross products, $\Sigma x'y' = 118.26$, from
Table 82, $N = 19$ years from 1915 to 1933 inclusive, $\sigma_{x'} = 3.51$,
and $\sigma_{y'} = 4.65$, as found above, we have

$$r = \frac{118.26 - 19\left(\frac{2.52}{19}\right)\left(\frac{3.50}{19}\right)}{19(3.51)(4.65)} = \frac{118.26 - 0.46}{310.11} = .38,$$

$$r^2 = 0.14.$$

So that the relationship between infant mortality rates in
Wisconsin and in the original registration area from year to
year enables us to predict one from a knowledge of the other
only 14 per cent more accurately than if we judged one of the
series from a knowledge of its own mean and variance.

Could it be that a correlation coefficient of r = .38 is due to
random accidental correspondence between the cyclical fluctuations
in the two series? Although we are dealing here with two
historical series, we have removed the secular trend, and this is 
sometimes regarded as warrant for applying the standard error 
to this situation. An inspection of the cycles in Fig. 56, how- 
ever, suggests that some correlation between successive years 
still remains, so that we can hardly assume that the death 
rates in our series, regarded as a sample, are independent of one 
another. Under these conditions, the basic assumptions of 
simple sampling underlying the standard error formula which is
appropriate in this case, viz., $\sigma_r = 1/\sqrt{N - 1}$, are violated;
so we are unable to answer the question asked at the beginning
of the paragraph. However, the absence of much correlation
between the Wisconsin rates and the original registration area
rates suggests that the control of infant mortality is primarily
a local problem. This should be further tested by comparing
infant mortality rates in Wisconsin with those in adjoining
states.
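The correlation of the trend deviations can be recomputed from the Total row of Table 82 with a sketch along the following lines. The function name and argument order are assumptions; the standard deviations are taken around the trend line, as in the text, and are carried unrounded, so the last decimal may differ slightly from a hand calculation with rounded figures.

```python
import math


def pearson_from_sums(sum_xy, sum_x, sum_y, sum_xx, sum_yy, n):
    """Pearsonian r from column totals of deviations from trend."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    sigma_x = math.sqrt(sum_xx / n)   # spread around the trend line
    sigma_y = math.sqrt(sum_yy / n)
    return (sum_xy - n * mean_x * mean_y) / (n * sigma_x * sigma_y)


N = 19
r = pearson_from_sums(sum_xy=118.26, sum_x=2.52, sum_y=3.50,
                      sum_xx=233.65, sum_yy=411.68, n=N)

# Standard error of r under simple sampling; the text cautions that the
# independence this formula assumes does not hold for these series.
standard_error = 1 / math.sqrt(N - 1)

print(round(r, 2), round(r * r, 2), round(standard_error, 3))
```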

It is just as important to avoid the distorting effects of one
or a few atypical, extreme values in correlating time series as in
other correlation problems (see Chap. X, Table 49). For example,
in Fig. 56 it appears that the war year 1918 was decidedly
abnormal in its infant mortality rate, and the same is to some
extent true of the depression year, 1933. In those two years
there is much less agreement than usual between the two series.
If we are interested primarily in knowing the amount of correlation
between infant death rates in Wisconsin and in the registration
area in normal years, it is, of course, desirable to omit the
two atypical years from the computation of the correlation
coefficient. This would make it necessary to fit a new trend
line to the remaining 17 years of the series, and find the coefficient
of correlation between the percentage deviations from it. In
case we do not want to confine the investigation of the amount
of association between the two series to “normal” years, which
are not always easy to define objectively, and yet we do want to
reduce the influence of the extreme or atypical values, it is
probably advisable to resort to the coefficient of rank correlation.
This coefficient, $\rho$, is calculated from Table 83, and has a value
of .41.

$$\rho = 1 - \frac{6\Sigma D^2}{N(N^2 - 1)} = 1 - \frac{6(678)}{19(19^2 - 1)} = .41.$$

Table 83. — Rank of Observed Rates as Per Cent of Trend

Year     Wisconsin   Registration      D       D^2
                         area
1915         1            6           -5        25
1916        18           11           +7        49
1917         5            9           -4        16
1918        11           19           -8        64
1919        16            5          +11       121
1920        15           14           +1         1
1921         9            1           +8        64
1922         8            4           +4        16
1923        12            7           +5        25
1924         2            3           -1         1
1925        14            8           +6        36
1926        19           18           +1         1
1927         6            2           +4        16
1928        13           13            0         0
1929        17           16           +1         1
1930        10           15           -5        25
1931         7           17          -10       100
1932         4           10           -6        36
1933         3           12           -9        81

Total                                          678


As expected, the result of using ranks in this case is to increase 
the amount of correlation somewhat. 
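The rank coefficient can be reproduced from the two rank columns of Table 83 with the following Python sketch. The function name `rank_correlation` is an assumption, and the formula used presumes there are no tied ranks, which is the case here.

```python
def rank_correlation(ranks_a, ranks_b):
    """Rank correlation: 1 - 6*sum(D^2) / (N*(N^2 - 1)), no ties assumed."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks for 1915 through 1933, copied from Table 83.
wisconsin = [1, 18, 5, 11, 16, 15, 9, 8, 12, 2, 14, 19, 6, 13, 17, 10, 7, 4, 3]
registration = [6, 11, 9, 19, 5, 14, 1, 4, 7, 3, 8, 18, 2, 13, 16, 15, 17, 10, 12]

sum_d2 = sum((a - b) ** 2 for a, b in zip(wisconsin, registration))
rho = rank_correlation(wisconsin, registration)
print(sum_d2, round(rho, 2))
```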

It often happens that the correlation of two time series is
greater if one of them is lagged one or more years, so that the
cycles correspond more closely. For example, if the marriage
rate declines sharply, so does the birth rate, but not until about
a year later. Therefore, to test the relationship between marriage
and birth rates, the latter should be lagged by one year.
That is, say, the 1930 birth rate should be paired with the 1929
marriage rate, etc. There is no indication that a lag is needed
in correlating the two series with which we were dealing above.
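A minimal Python sketch of lagging one series before correlating is given below. The marriage and birth figures are invented purely for illustration (they are not from any table in this book), and the helper names are hypothetical; the births are constructed to follow the marriages one year later.

```python
import math


def lagged_pairs(leading, following, lag):
    """Pair leading[t] with following[t + lag]; unmatched ends are dropped."""
    if lag < 0:
        raise ValueError("lag must be zero or positive")
    return list(zip(leading, following[lag:]))


def pearson(pairs):
    """Pearsonian r computed from deviations around the two means."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical rates for six successive years.
marriages = [10.0, 9.0, 8.0, 9.5, 10.5, 9.8]
births = [20.5, 20.0, 18.0, 16.0, 19.0, 21.0]   # follows marriages by one year

r_unlagged = pearson(lagged_pairs(marriages, births, 0))
r_lagged = pearson(lagged_pairs(marriages, births, 1))
print(round(r_unlagged, 2), round(r_lagged, 2))
```

Each year of lag shortens the paired series by one observation, which matters with series as short as those used in this chapter.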

5. Seasonal Fluctuations. — Data such as infant mortality
rates may be obtained by months as well as by years. This
affords an opportunity to study the seasonal fluctuations in
infant deaths, i.e., the variations in death rates that are associated
with spring, summer, fall, and winter. To do this, we must
first separate the seasonal fluctuations from the secular trend,
the short-term cycles, and the random fluctuations, all of which
appear in the original monthly rates given in col. (2) of Table 84.
We average the 12 monthly rates in each year in Table 84 to
obtain annual rates, which are entered in Table 85, and plotted
in Fig. 57. Inspection of Fig. 57 shows a decline in the infant
mortality rate in five out of seven years, and suggests that a
straight line probably is most appropriate to represent the
secular trend. Table 85 shows the calculations needed to fit a
linear trend to the annual rates by the method of least squares.

Table 84. — Infant Mortality Rates by Months, United States
Registration Area, 1928-1935*

Columns: (1a) year; (1b) month; (2) infant mortality rate per 1,000 live
births in same month; (3) monthly trend rates; (4) observed rate as per
cent of trend, (2) ÷ (3) × 100; (5) monthly averages of observed rates as
per cent of trend (Table 87); (6) seasonal index; (7) cycles, (4) - (6).

(1a)  (1b)     (2)     (3)      (4)      (5)      (6)      (7)
1928  Jan     72.4   69.24   104.56   113.34   113.35    -8.79
      Feb     73.2   69.09   105.95   112.19   112.20    -6.25
      Mar     74.8   68.93   108.52   108.55   108.56    -0.04
      Apr     75.0   68.78   109.04   103.38   103.39    +5.65
      May     70.4   68.62   102.59    97.48    97.49    +5.10
      June    64.2   68.47    93.76    93.59    93.60    +0.16
      July    60.8   68.31    89.01    90.09    90.10    -1.09
      Aug     60.2   68.16    88.32    87.03    87.04    +1.28
      Sept    63.4   68.00    93.24    91.20    91.21    +2.03
      Oct     64.3   67.84    94.78    97.09    97.10    -2.32
      Nov     65.2   67.69    96.32    97.83    97.84    -1.52
      Dec     81.3   67.53   120.39   108.07   108.08   +12.31
1929  Jan     99.1   67.39   147.05   113.34   113.35   +33.70
      Feb     84.8   67.22   126.15   112.19   112.20   +13.95
      Mar     74.3   67.07   110.78   108.55   108.56    +2.22
      Apr     66.1   66.91    98.79   103.38   103.39    -4.60
      May     63.9   66.76    95.72    97.48    97.49    -1.77
      June    57.8   66.60    86.79    93.59    93.60    -6.81
      July    55.7   66.45    83.82    90.09    90.10    -6.28
      Aug     57.7   66.29    87.04    87.03    87.04     0.00
      Sept    63.4   66.13    95.87    91.20    91.21    +4.66
      Oct     64.9   65.98    98.36    97.09    97.10    +1.26
      Nov     59.4   65.82    90.25    97.83    97.84    -7.59
      Dec     65.2   65.67    99.28   108.07   108.08    -8.80
1930  Jan     67.8   65.51   103.50   113.34   113.35    -9.85
      Feb     69.8   65.36   106.79   112.19   112.20    -5.41
      Mar     69.3   65.20   106.29   108.55   108.56    -2.27
      Apr     68.2   65.05   104.84   103.38   103.39    +1.45
      May     62.5   64.89    96.32    97.48    97.49    -1.17
      June    61.4   64.74    94.84    93.59    93.60    +1.24
      July    59.3   64.58    91.82    90.09    90.10    +1.72
      Aug     56.0   64.43    86.92    87.03    87.04    -0.12
      Sept    61.7   64.27    96.00    91.20    91.21    +4.79
      Oct     67.1   64.12   104.65    97.09    97.10    +7.55
      Nov     63.5   63.96    99.28    97.83    97.84    +1.44
      Dec     69.8   63.81   109.39   108.07   108.08    +1.31
1931  Jan     75.3   63.65   118.30   113.34   113.35    +4.95
      Feb     74.6           117.50   112.19   112.20    +5.30
      Mar     70.4   63.34   111.15   108.55   108.56    +2.59
      Apr     65.7   63.18   103.99   103.38   103.39    +0.60
      May     56.4   63.03    89.48    97.48    97.49    -8.01
      June    53.4   62.87    84.94    93.59    93.60    -8.66
      July    54.1   62.72    86.26    90.09    90.10    -3.84
      Aug     54.3   62.56    86.80    87.03    87.04    -0.24
      Sept    58.5   62.41    93.73    91.20    91.21    +2.52
      Oct     61.0   62.25    97.99    97.09    97.10    +0.89
      Nov     58.1   62.10    93.56    97.83    97.84    -4.28
      Dec     57.3   61.94    92.51   108.07   108.08   -15.57
1932  Jan     56.5   61.79    91.44   113.34   113.35   -21.91
      Feb     57.5   61.63    93.30   112.19   112.20   -18.90
      Mar     62.8   61.48   102.15   108.55   108.56    -6.41
      Apr     60.0   61.32    97.85   103.38   103.39    -5.54
      May     57.8   61.17    94.49    97.48    97.49    -3.00
      June    56.1   61.01    91.95    93.59    93.60    -1.65
      July    55.2   60.85    90.71    90.09    90.10    +0.61
      Aug     50.9   60.70    83.86    87.03    87.04    -3.18
      Sept    49.7   60.54    82.09    91.20    91.21    -9.12
      Oct     52.5   60.39    86.93    97.09    97.10   -10.17
      Nov     60.4   60.23   100.28    97.83    97.84    +2.44
      Dec     73.0   60.08   121.50   108.07   108.08   +13.42
1933  Jan     71.2   59.92   118.83   113.34   113.35    +5.48
      Feb     69.9   59.77   116.95   112.19   112.20    +4.75
      Mar     60.1   59.61   100.82   108.55   108.56    -7.74
      Apr     56.3   59.46    94.69   103.38   103.39    -8.70
      May     54.7   59.30    92.24    97.48    97.49    -5.25
      June    56.2   59.15    95.01    93.59    93.60    +1.41
      July    51.9   58.99    87.98    90.09    90.10    -2.12
      Aug     50.1   58.84    85.15    87.03    87.04    -1.89
      Sept    54.9   58.68    93.56    91.20    91.21    +2.35
      Oct     58.7   58.52   100.31    97.09    97.10    +3.21
      Nov     58.0   58.37    99.37    97.83    97.84    +1.53
      Dec     58.5   58.21   100.50   108.07   108.08    -7.58
1934  Jan     60.6   58.06   104.37   113.34   113.35    -8.98
      Feb     66.5   57.90   114.85   112.19   112.20    +2.65
      Mar     67.7   57.75   117.23   108.55   108.56    +8.67
      Apr     64.9   57.59   112.69   103.38   103.39    +9.30
      May     60.8   57.44   105.85    97.48    97.49    +8.36
      June    60.2   57.28   105.10    93.59    93.60   +11.50
      July    58.3   57.13   102.05    90.09    90.10   +11.95
      Aug     52.2   56.97    91.63    87.03    87.04    +4.59
      Sept    51.6   56.82    90.81    91.20    91.21    -0.40
      Oct     57.0   56.66   100.60    97.09    97.10    +3.50
      Nov     59.3   56.51   104.94    97.83    97.84    +7.10
      Dec     63.8   56.35   113.22   108.07   108.08    +5.14
1935  Jan     66.7   56.19   118.70   113.34   113.35    +5.35
      Feb     65.0   56.04   115.99   112.19   112.20    +3.79
      Mar     62.3   55.88   111.49   108.55   108.56    +2.93
      Apr     58.6   55.73   105.15   103.38   103.39    +1.76
      May     57.3   55.57   103.11    97.48    97.49    +5.62
      June    53.4   55.42    96.36    93.59    93.60    +2.76
      July    49.2   55.26    89.03    90.09    90.10    -1.07
      Aug     47.7   55.11    86.55    87.03    87.04    -0.49
      Sept    46.3   54.95    84.26    91.20    91.21    -6.95
      Oct     51.0   54.80    93.07    97.09    97.10    -4.03
      Nov     53.9   54.64    98.65    97.83    97.84    +0.81
      Dec     58.7   54.49   107.73   108.07   108.08    -0.35

* From Births, Stillbirths, and Infant Mortality, U. S. Bureau of the Census, annual
publication.

Fig. 57. — Annual infant mortality rates in the registration area of the United
States, 1928-1935. (From Table 85.)



Table 85. — Values Needed for Fitting a Straight Line to the Annual
Infant Mortality Rates in the Registration Area of the
United States, 1928-1935

          Infant      Year
Year      death       code       X'Y       X'^2     Trend    Deviations
         rate (Y)     (X')                          values   from trend
1928      68.767       -3     -206.301       9      68.388     + .379
1929      67.692       -2     -135.384       4      66.524     +1.168
1930      64.700       -1     - 64.700       1      64.660     + .040
1931      61.592        0        0.000       0      62.796     -1.204
1932      57.700       +1       54.700       1      60.932     -3.232
1933      58.375       +2      116.750       4      59.068     - .693
1934      60.242       +3      180.726       9      57.204     +3.038
1935      55.842       +4      223.368      16      55.340     + .502

Total    494.910               169.159      44                 - .002
Mean      61.864       0.5


Substituting the values found in Table 85 in the normal equations
for determining the constants in the equation of a straight
line, we have

$$b = \frac{\Sigma X'Y - N M_{x'} M_y}{\Sigma X'^2 - N M_{x'}^2},$$

$$a = M_y - b M_{x'},$$

$$b = \frac{169.159 - 8(0.5)(61.864)}{44 - 8(0.25)},$$

$$b = -1.864,$$

$$a = 61.864 - (-1.864)(0.5),$$

$$a = 62.796,$$

so that

$$Y_t = a + bX',$$

$$Y_t = 62.796 - 1.864X'. \tag{154}$$

From formula (154) the trend values shown in the next to the
last column of Table 85 are estimated by substituting for X' its
successive values taken from the third column of the table. The
annual trend line is plotted in Fig. 57. The last column of
Table 85, obtained by subtracting the trend values from the 
observed Y values, is inserted as a check on the arithmetic. 
Its sum is approximately zero, as it should be if the calculations 
are carried far enough. 
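The least-squares constants can be recomputed from the totals and means in Table 85 with a short Python sketch. The function name is an assumption; the inputs are the table's printed totals, taken as given, and the trend values are then generated for the year codes -3 through +4.

```python
def fit_line_from_sums(sum_xy, sum_xx, mean_x, mean_y, n):
    """Slope b and intercept a of the least-squares line from column totals."""
    b = (sum_xy - n * mean_x * mean_y) / (sum_xx - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b


# Totals and means as printed in Table 85.
a, b = fit_line_from_sums(sum_xy=169.159, sum_xx=44,
                          mean_x=0.5, mean_y=61.864, n=8)

year_codes = range(-3, 5)                   # 1928 ... 1935
trend = [a + b * code for code in year_codes]

print(round(a, 3), round(b, 3))
print([round(t, 3) for t in trend])
```

Because a and b are carried here to full precision, the generated trend values may differ from the tabulated ones in the third decimal place.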



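Fig. 58. — Observed monthly infant mortality rates and monthly trend line,
United States registration area, 1928-1935. (From Table 84.)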


Since in each year the infant death rate declines on the
average 1.864, in one month the decline is 1.864/12 = 0.1553. In
Table 85 we used average annual rates, which apply to the
middle of a year. The middle of the year falls on June 30.
The average monthly rates, however, apply to the middle of
each month. We, therefore, enter Table 84 at June, 1928, and
add to the annual 1928 trend rate of 68.388 one-half of the
correction factor, 0.1553, so that we have

68.388 + 0.0777 = 68.4657

as the June, 1928, monthly trend in col. (3) of Table 84. We
then add 0.1553 accumulatively to this rate for the five preceding
months in 1928, and subtract 0.1553 accumulatively from it for
each subsequent month throughout the eight-year period, which
completes col. (3). The monthly trend line, which is identical
with the annual trend line, and the observed monthly rates from
col. (2) of Table 84, are plotted in Fig. 58. From this graph, it is
seen that in spite of a general downward trend in infant mortality
rates, these rates have fluctuated considerably, so that even in,
say, early 1935 they were much higher than in the middle of 1928.
How much of this variation is due to the season of the year?
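The construction of the monthly trend column can be sketched in Python as follows. The two constants are the figures given in the text; the variable names are assumptions, and months are indexed from 0 (January, 1928) to 95 (December, 1935), so that June, 1928, is index 5.

```python
MONTHLY_DECLINE = 0.1553      # 1.864 / 12, as rounded in the text
JUNE_1928_TREND = 68.4657     # 68.388 + 0.0777
JUNE_1928_INDEX = 5           # January 1928 is month 0

# Add the monthly decline for each month before June 1928 and subtract it
# for each month after, over the 96 months of the eight-year period.
monthly_trend = [
    JUNE_1928_TREND - MONTHLY_DECLINE * (m - JUNE_1928_INDEX)
    for m in range(96)
]

print(round(monthly_trend[0], 2))    # January 1928
print(round(monthly_trend[95], 2))   # December 1935
```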


Table 86. — Frequency Distribution of Observed Infant Mortality
Rates Expressed as Percentages of Trend, by Months, United
States Registration Area, 1928-1935*

Observed rates,
per cent of trend  Jan    Feb    Mar    Apr    May    June   July   Aug    Sept   Oct    Nov    Dec
145-149            /
140-144
135-139
130-134
125-129                   /
120-124                                                                                         //
115-119            ///    ///    /
110-114                   /      ///    /                                                       /
105-109                   //     //     //     /      /                                         //
100-104            ///           //     //     //            /                    ///    //     /
 95- 99                                 //     //     //                   //     //     ////   /
 90- 94            /      /             /      //     ///    //     /      ////   //     //     /
 85- 89                                        /      /      ////   ////// 		  /
 80- 84                                               /      /      /      //

* From col. (4), Table 84.


Column (4) of Table 84 shows the monthly observed rates
expressed as percentages of the monthly secular trend rates.
These percentages represent the seasonal variations combined
with the short-term cycles and the random fluctuations, but
with the secular trend eliminated. To remove the short-term
cycles and random fluctuations, it is necessary to average the
percentages for each month over the eight-year period. As an
aid in revealing whether or not a seasonal movement actually
exists, and in choosing the most stable kind of monthly average,
Table 86 is set up. A glance at it shows clearly the presence of a
seasonal pattern in infant mortality. The death rate is high in
the winter and low in the late summer. From the arrangement
of the frequencies in the several columns, it appears that, except
possibly in January, the arithmetic mean is a suitable average
to use in this case. As a rule, however, it is recommended to
average the middle three or four values for each month, a sort
of combined mean and median average which avoids the distortion
due to extreme values. In Table 87 the mean monthly
values are found¹ and are entered in col. (5) of Table 84. To
convert the 12 mean monthly percentages to index numbers,
they are divided by their own average, 99.99, and the quotient
multiplied by 100, to give the last row of Table 87 and col. (6)
of Table 84. The indexes of seasonal variation have an advantage
over the simple percentages of col. (5) of Table 84, in that they
vary around a mean of exactly 100.00 per cent, and are therefore
more generally comparable and finished in form. In the seasonal
indexes of col. (6) of Table 84 there now remains only the seasonal
variation, since the secular trend, cycles, and random fluctuations
were removed by the steps just taken. An undistorted
idea of the seasonal variation can now be obtained by plotting
the monthly seasonal indexes around their mean of 100 per cent,
as in Fig. 59. It is again obvious that the winter months are
the danger period for infants.

Fig. 59. — Seasonal indexes of infant mortality rates in the registration area of the
United States, 1928-1935. (From Table 87.)

¹ From col. (4) of Table 84.
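The conversion of monthly means to seasonal indexes, and the alternative "middle values" average mentioned above, can be sketched in Python. The monthly means are those of Table 87 and the January percentages are the January entries of col. (4); the function names are assumptions. Because the grand average is carried unrounded here rather than as 99.99, an index may differ from the tabulated one by about .01.

```python
def seasonal_indexes(monthly_means):
    """Divide each monthly mean by the average of all twelve, times 100."""
    grand_mean = sum(monthly_means.values()) / len(monthly_means)
    return {m: 100.0 * v / grand_mean for m, v in monthly_means.items()}


def middle_mean(values, keep):
    """Mean of the `keep` central values after sorting (extremes dropped)."""
    ordered = sorted(values)
    start = (len(ordered) - keep) // 2
    return sum(ordered[start:start + keep]) / keep


monthly_means = {
    "Jan": 113.34, "Feb": 112.19, "Mar": 108.55, "Apr": 103.38,
    "May": 97.48, "June": 93.59, "July": 90.09, "Aug": 87.03,
    "Sept": 91.20, "Oct": 97.09, "Nov": 97.83, "Dec": 108.07,
}
indexes = seasonal_indexes(monthly_means)

# January percentages of trend, 1928 through 1935.
january = [104.56, 147.05, 103.50, 118.30, 91.44, 118.83, 104.37, 118.70]
january_middle_four = middle_mean(january, 4)

print(round(indexes["Jan"], 2), round(indexes["Aug"], 2))
print(round(january_middle_four, 2))
```

The middle-four average for January falls below the arithmetic mean of 113.34 because it leaves out the exceptional 1929 value.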

6. Short-term Cycles Freed from Seasonal Fluctuations. — If
we wish to observe the short-term cycles mixed with random
fluctuations in the monthly infant mortality rates, freed from
both the secular trend and the seasonal movement, this may be
done by recording in col. (7) of Table 84 the differences between
the percentages of trend in col. (4) and the seasonal indexes in
col. (6), and plotting them in Fig. 60. It appears that a number
of other factors besides the season of the year affect the infant
death rate, and need to be studied and brought under control.
There is no suggestion from Fig. 60 that any progress was made
during the eight-year period in reducing the percentage of
infant deaths due to cyclical and random causes. The point
might be tested by obtaining the standard deviations around
zero of the differences in col. (7) of Table 84, for the first two
years and the last two years of the period, and comparing the
two standard deviations.

Fig. 60. — Short-term cycles and random fluctuations in infant mortality rates,
United States registration area, 1928-1935. (From Table 84.)
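The test suggested here can be sketched in Python. The values are the col. (7) differences of Table 84 for 1928-1929 and 1934-1935, each taken as col. (4) minus col. (6); the function name is an assumption, and the deviations are measured around zero rather than around their own mean, as the text proposes.

```python
import math


def sd_around_zero(values):
    """Standard deviation measured from zero instead of from the mean."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Col. (7) of Table 84, January through December of each year.
first_two_years = [
    -8.79, -6.25, -0.04, 5.65, 5.10, 0.16, -1.09, 1.28, 2.03, -2.32, -1.52, 12.31,
    33.70, 13.95, 2.22, -4.60, -1.77, -6.81, -6.28, 0.00, 4.66, 1.26, -7.59, -8.80,
]
last_two_years = [
    -8.98, 2.65, 8.67, 9.30, 8.36, 11.50, 11.95, 4.59, -0.40, 3.50, 7.10, 5.14,
    5.35, 3.79, 2.93, 1.76, 5.62, 2.76, -1.07, -0.49, -6.95, -4.03, 0.81, -0.35,
]

sd_first = sd_around_zero(first_two_years)
sd_last = sd_around_zero(last_two_years)
print(round(sd_first, 2), round(sd_last, 2))
```

A single extreme month, such as January, 1929, weighs heavily in a comparison of this kind, so the two figures should be read with the earlier caution about atypical values in mind.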


Table 87. — Calculation of Monthly Means of Infant Mortality
Rates Expressed as Percentages of Trend, United States
Registration Area, 1928-1935

Year      Jan     Feb     Mar     Apr     May     June    July    Aug     Sept    Oct     Nov     Dec
1928    104.56  105.95  108.52  109.04  102.59   93.76   89.01   88.32   93.24   94.78   96.32  120.39
1929    147.05  126.15  110.78   98.79   95.72   86.79   83.82   87.04   95.87   98.36   90.25   99.28
1930    103.50  106.79  106.29  104.84   96.32   94.84   91.82   86.92   96.00  104.65   99.28  109.39
1931    118.30  117.50  111.15  103.99   89.48   84.94   86.26   86.80   93.73   97.99   93.56   92.51
1932     91.44   93.30  102.15   97.85   94.49   91.95   90.71   83.86   82.09   86.93  100.28  121.50
1933    118.83  116.95  100.82   94.69   92.24   95.01   87.98   85.15   93.56  100.31   99.37  100.50
1934    104.37  114.85  117.23  112.69  105.85  105.10  102.05   91.63   90.81  100.60  104.94  113.22
1935    118.70  115.99  111.49  105.15  103.11   96.36   89.03   86.55   84.26   93.07   98.65  107.73

Total   906.75  897.48  868.43  827.04  779.80  748.75  720.68  696.27  729.56  776.69  782.65  864.52
Mean    113.34  112.19  108.55  103.38   97.48   93.59   90.09   87.03   91.20   97.09   97.83  108.07
Index   113.35  112.20  108.56  103.39   97.49   93.60   90.10   87.04   91.21   97.10   97.84  108.08


Exercises

1. Compare the trends in the birth rates of cities and of rural areas
in the original registration area of the United States over the 19-year
period, 1915 through 1933, using the data in the table below. Show
the cyclical deviations from trend, compare the variability of the two
series, and calculate the amount of correlation between the fluctuations
of the two series. What should be done with the data for extremely
atypical years, such as the war year, 1918? Is the correlation improved
by “lagging” one of the series? Plot all data.

Birth Rates per 1,000 Population for Cities and Rural Areas in the
Original Registration Area of the United States, 1915-1933*

            Birth rate                  Birth rate
Year     Cities    Rural     Year    Cities    Rural
1915      26.0     23.8      1925     22.2     20.3
1916      26.0     23.5      1926     21.5     19.1
1917      26.4     23.3      1927     21.3     19.0
1918      25.8     23.0      1928     20.5     18.1
1919      23.8     21.1      1929     19.7     16.8
1920      24.6     22.2      1930     19.3     16.7
1921      24.5     23.1      1931     17.8     16.3
1922      22.9     21.8      1932     17.0     15.5
1923      22.9     21.1      1933     15.8     14.8
1924      23.2     21.2

* From Birth, Stillbirth, and Infant Mortality Statistics, 1935, pp. 5-6, Bureau of the
Census.

2. For the relief data in the accompanying table show the secular
trend and the seasonal fluctuations, and plot the results in each case.

Number of Cases Receiving Relief in 385 Rural and Town Areas of
the United States, 1932-1936*

                              Cases
Month          1932      1933      1934      1935      1936
January       30,931    99,064   169,554   298,785   145,734
February      32,552   107,860   177,041   299,217   146,697
March         34,239   128,794   202,551   290,217   143,000
April         32,965   121,234   216,463   279,901   131,038
May           30,713   112,079   222,647   266,014   123,102
June          30,774   110,158   232,331   244,074   117,808
July          29,687   131,850   239,441   227,814   120,067
August        30,214   126,572   259,410   218,883   128,303
September     33,561   114,147   255,929   204,745   129,124
October       38,126   117,459   251,397   201,341   144,492
November      65,922   135,234   262,635   198,780   149,781
December      75,517   115,877   282,068   167,297   166,173

* Adapted from Waller Wynne, Jr., Five Years of Rural Relief, p. 36, WPA, Division of
Social Research, 1938.

Appendix


Table 1. — Area and Ordinate of the Normal Curve¹


x/<T 

Area 

Ordinate (y) 

xjtt 

Area 

Ordinate (y) 

00 

.0000000 

3989423 

.46 

1772419 

3588903 

01 

.0039894 

3989223 

!47 

. 1808225 

3572253 

02 

03 

.0079783 

0119666 

3988625 

3987628 

.48 

1843863 

3555325 

04 

0169534 

3986233 

.49 

.1879331 

3538124 

06 

0199388 

3984439 

.50 

.1914625 

3520653 

06 

0239222 

3982248 

.51 

.1949743 

3502919 

.07 

0279032 

3979661 

.52 

.1984682 

3484925 

.08 

.0318814 

3976677 

.53 

2019440 

3466677 

09 

0358564 

3973298 

.54 

2054015 

3448180 

10 

0398278 

.3969525 

.55 

.2088403 

3429439 

11 

0437953 

3965360 

.56 

.2122603 

3410458 

12 

0477584 

3960802 

57 

.2156612 

3391243 

.13 

0517168 

3955854 

.58 

2190427 

3371799 

14 

0566700 

3950517 

59 

2224047 

3352132 

16 

.0596177 

3944793 

.60 

2257469 

3332246 

16 

0635595 

3938684 

.61 

2290691 

1 3312147 

17 

0674949 

3932190 

62 

2323711 

3291840 

18 

0714237 

3925315 

63 

2356527 

3271330 

19 

0753454 

3918060 

.64 

2389137 

3250623 

20 

0792597 

3910427 

.65 

2421539 

3229724 

.21 

0831662 

3902419 

66 

2453731 

3208638 

22 

0870644 

3894038 

67 

2485711 

3187371 

.23 

0909541 

3885286 

68 

2517478 

3165929 

24 

0948349 

3876166 

.69 

2549029 

3144317 

26 

0987063 

3866681 

70 

2580363 

3122539 

26 

1025681 

3856834 

71 

2611479 

3100603 

27 

1064199 

3846627 

72 

2642375 

3078513 

28 

1102612 

3836063 

73 

2673049 

3056274 

29 

1140919 

3825146 

74 

2703500 

3033893 

30 

1179114 

3813878 

.75 

.2733726 

3011374 

31 

1217195 

3802264 

.76 

2763727 

2988724 

.32 

1255158 

3790305 

.77 

2793501 

2965948 

.33 

1293000 

3778007 

.78 

2823046 

.2943050 

,34 

1330717 

3765372 

.79 

.2852361 

2920038 

.36 

1368307 

.3752403 

.80 

.2881446 

2896916 

36 

1406764 

3739106 

.81 

.2910299 

2873689 

37 

1443088 

3725483 

82 

2938919 

2850364 

38 

1480273 

3711539 

.83 

2967306 

2826945 

.39 

1517317 

3697277 

.84 

2995458 

2803438 

,40 

1555417 

3682707 

.85 

.3023375 

,2779849 

41 

1590970 

.3667817 

86 

3051055 

2756182 

.42 

1627573 

3652627 

.87 

3078498 

2732444 

43 

1664022 

3637136 

88 

.3105703 

2708640 

44 

1700314 

3621349 

89 

3132671 

2684774 

46 

1736448 

3605270 

.90 

3159399 

2660852 


¹ From Kent, “The Elements of Statistics.”

91 

3186887 

2636880 

1 36 

4130850 

1582248 

92 

3212136 

.2612863 

1 37 

4146565 

1560797 

.93 

3238145 

2588805 

1 38 

4162067 

1539483 

94 

.3263912 

2564713 

1 39 

4177356 

1518308 

.95 

3289439 

.2640691 

1 40 

.4192433 

1497275 

.96 

.3314724 

2516443 

1 41 

4207302 

1476385 

97 

.3339768 

.2492277 

1 42 

4221962 

1455641 

98 

.3364569 

.2468095 

1 43 

4236415 

1435046 

99 

3389129 

.2443904 

1 44 

4250663 

1414600 

1 00 

3413447 

.2419707 

1 46 

4264707 

1394306 

1 01 

3437624 

2396511 

1 46 

.4278550 

1374165 

1 02 

3461358 

2371320 

1 47 

4292191 

1354181 

1 03 

3484950 

,2347138 

1 48 

4305634 

1334353 

1 04 

3608300 

2322970 

1 49 

4318879 

1314684 

1 06 

.3531409 

.2298821 

1 60 

.4331928 

1295176 

1 06 

3554277 

2274696 

1 61 

4344783 

1275830 

1.07 

.3576903 

2250699 

1 52 

.4357445 

1256646 

1.08 

.3599289 

2226535 

1 53 

4369916 

1237628 

1 09 

3621434 

2202508 

1 64 

4382198 

1218775 

1 10 

3643339 I 

2178522 

1 66 

4394292 

1200090 

1 11 

3666005 

2154582 

1 66 

4406201 

1181573 

1 12 

3686431 

2130691 

1 57 

4417924 

1163225 

1 13 

3707619 

2106856 

1 68 

4429466 

1145048 

1 14 

3728668 

2083078 

1 59 

4440826 

1127042 

1 15 

3749281 

2059363 

1 60 

.4452007 

1109208 

1 16 

3769756 

2035714 

1 61 

4463011 

1091548 

1 17 

3789995 

2012135 

1 62 

4473839 

1074061 

1 18 

3809999 

1988631 

1 63 

4484493 

1056748 

1 19 

3829768 

1965205 

1 64 1 

4494974 j 

1039611 

1 20 

.3849303 

1941861 

1 65 

4505285 

1022649 

1 21 

3868606 

1918602 

1 66 

4515428 

1005864 

1 22 

3887676 

.1896432 

1 67 

.4525403 

,0989255 

1 23 

3906614 

1872354 

1 68 

4535213 

.0972823 

1 24 

3925123 

.1849373 

1 69 

4544860 

0956568 

1 25 

3943602 

.1826491 

1 70 

4554345 

0940491 

1 26 

.3961663 

1803712 

1 71 

4563671 

0924591 

1 27 

3979577 

,1781038 

1 72 

4572838 

0908870 

1 28 

3997274 

1758474 

1 73 

4581849 

0893326 

1 29 

4014747 

1736022 

1 74 

4590705 

0877961 

1 30 

.4031996 

1713686 

1 76 

4599408 

0862773 

1 31 

.4049021 

1691468 

1 ^ 

.4607961 

0847764 

1 32 

4065826 

.1669370 

I 1 77 

.4616364 

0832932 

1 33 

4082409 

.1647397 

1 ^ 

4624620 

0818278 

1 34 

4098773 

1626551 

1 ^ 

4632730 

0803801 

' 1 35 

4114920 

1603833 

1 ^ 

4640697 

0789502 


¹ From Kent, “The Elements of Statistics.”
Table 1. — Area and Ordinate of the Normal Curve.¹ — (Continued)


x/σ

Area

Ordinate (y)

x/σ

Area

Ordinate (y)

1 81 

.4648621 

.0776379 

2.26 

.4880894 

0310319 

1 82 

.4656206 

.0761433 

2 27 

.4883962 

0303370 

1.83 

-4663760 

.0747663 

2 28 

.4886962 

.0296546 

1.84 

-4671159 

.073406S 

2.29 

.4889893 

.0289847 

1.86 

,4678432 

.0720649 

2.30 

-4892759 

.0283270 

1 86 

.4685572 

.0707404 

2 31 

.4895569 

0276816 

1 87 

-4692581 

.0694333 

2 32 

.4898296 

0270481 

1 88 

.4699460 

068143S 

2 33 

.4900969 

0264265 

1.89 

-4706210 

0668711 

2.34 

.4903581 

0258166 

1.90 

-4712834 

.0656168 

2.36 

.4906133 

0252182 

1 91 

.4719334 

.0643777 

2.36 

.4908625 

0246313 

1 92 

,4725711 

0631666 

2 37 

.4911060 

0240666 

1.93 

.4731966 

.0619524 

2 38 

.4913437 

0234910 

1 94 

.4738102 

0607652 

2 39 

.4915758 

0229374 

1 96 

-4744119 

.0596947 

2.40 

.4918025 

.0223945 

1 96 

.4760021 

0584409 

2 41 

.4920237 

.0218624 

1 97 

.4765808 

.0573038 

2 42 

.4922397 

0213407 

1 98 

.4761482 

0561831 

2 43 

.4924506 

0208294 

1 99 

.4767045 

0550789 

2 44 

4926564 

0203284 

2 00 

.4772499 

0539910 

2 46 

.4928572 

.0198374 

2 01 

-4777844 

0529192 

2 46 

.4930631 

0193663 

2 02 

,4783083 

.0618636 

2 47 

.4932443 

0188850 

2 03 

.4788217 

0508239 

2 48 

.4934309 

.0184233 

2 04 

.4793248 

0498001 

2 49 

.4936128 

0179711 

2 06 

,4798178 

.0487920 

2 60 

.4937903 

.0175283 

2 06 

.4803007 

0477996 

2 51 

.4939634 

0170947 

2 07 

.4807738 

0468226 

2 62 

.4941323 

0166701 

2 08 

.4812372 

0458611 

2 63 

.4943001 

0162452 

2 09 

.4816911 

0449148 

2 54 

-4944574 

0158476 

2 10 

.4821356 

.0439836 

2 65 

.4946139 

.0164493 

2 11 

,4826708 1 

.0430674 

2 66 

.4947664 

0160596 

2 12 

,4829970 

0421661 

2 57 

.4949151 

0146782 

2 13 

.4834142 

.0412795 

2 68 

-4960600 

.0143061 

2 14 

.4838226 

0404076 

2 69 

-4962012 

.0139401 

2 16 

.4842224 

.0395500 

2 60 

-4953388 

.0136830 

2 16 

.4846137 

0387069 

2 61 

.4954729 

.0132337 

2 17 

.4849966 

0378779 

2 62 

-4966035 

0128921 

2 IS 

-4863713 

0370629 

2 63 

-4957308 

0126581 

2.19 

-4857379 

0362619 

2 64 

.4958547 

0122315 

2 20 

-4860966 

0354746 

2 65 

-4959764 

.0119122 

2 21 

-4864474 

0347009 

2 66 

.4960930 

.0116001 

2 22 

.4867906 

.0339408 

2 67 

.4962074 

0112961 

2 23 

.4871263 

0331939 

2 68 

4963189 

0109969 

2 24 

.4874545 j 

.0324603 

2 69 

4964274 

0107056 

2 26 

.4877756 

.0317397 

2 70 

,4966330 

0104209 


¹ From Kent, “The Elements of Statistics.”
Table 1. — Area and Ordinate of the Normal Curve.¹ — (Continued)


x/σ

Area

Ordinate (y)

x/σ

Area

Ordinate (y)

2 71 

4966358 

0101428 

3 16 

4992112 

0027075 

2.72 

4967359 

0098712 

3 17 

4992378 

0026231 

2 73 

.4968333 

0096058 

3 18 

4992636 

0026412 

2.74 

.4969280 

0093466 

3 19 

.4992886 

0024615 

2.75 

4970202 

.0090936 

3 20 

4993129 

0023841 

2 76 

4971099 

.0088465 

3 21 

4993363 

0023089 

2 77 

.4971972 

0086062 

3 22 

.4993590 

0022358 

2 78 

4972821 

0083697 

3 23 

.4993810 

0021649 

2 79 

.4973646 

0081398 

3 24 

4994024 

0020960 

2 80 

4974449 

.0079155 

3 25 

,4994230 

0020290 

2.81 

4975229 

.0076965 

3 26 

4994429 

0019641 

2 82 

.4975988 

.0074829 

3 27 

4994623 

.0019010 

2 83 

4976726 

.0072744 

3 28 

.4994810 

0018397 

2 84 

4977443 

.0070711 

3 29 

4994991 

0017803 

2.85 

.4978140 

.0068728 

3 30 

4996166 

0017226 

2 86 

.4978818 

0066793 

3 31 

.4996335 

0016666 

2 87 

.4979476 

0064907 

3 32 

4995499 

0016122 

2 88 

4980116 

0063067 

3 33 

4995658 

0015595 

2 89 

4980738 

0061274 

3 34 

.4995811 

0016084 

2 90 

.4981342 

0059525 

3 35 

.4996969 

.0014587 

2.91 

.4981929 

0057821 

3 36 

.4996103 

0014106 

2 92 

4982498 

0056160 

3 37 

4996242 

0013639 

2 93 

4983052 

0054541 

3 38 

4996376 

0013187 

2.94 

.4983589 

0052963 

3 39 

4996505 

0012748 

2 95 

.4984111 

0051426 

3 40 

4996631 

0012322 

2 96 

4984618 

0049929 

3 41 

4996762 

0011910 

2 97 

4985110 

0048470 

3 42 

.4996869 

0011610 

2 98 

4985588 

0047050 

3 43 

4996982 

0011122 

2 99 

4986051 

0045666 

3 44 

4997091 

0010747 

3 00 

4986501 

0044318 

3 45 

.4997197 

0010383 

3 01 

.4986938 

0043007 

3 46 

4997299 

0010030 

3 02 

4987361 

0041729 

3 47 

.4997398 

0009689 

3 03 

4987772 

0040486 

3 48 

,4997493 

0009358 

3 04 

4988171 

0039276 

3 49 

4997686 

0009037 

3 05 

4988558 

.0038098 

3 60 

4997674 

0008727 

3 06 

.4988933 

.0036951 

3 61 

.4997759 

0008426 

3 07 

4989297 

0035836 

3 52 

4997842 

0008135 

3 08 

.4989650 

0034751 

3 53 

4997922 

0007863 

3 09 

4989992 

0033695 

3 64 

4997999 

0007681 

3 10 

.4990324 

0032668 

3.55 

4998074 , 

0007317 

3 IX 

.4990646 

0031669 

3.66 

.4998146 

0007061 

3 12 

-4990957 

0030698 

3.67 

4998215 

0006814 

3 13 

-4991260 

0029754 

3 58 

4998282 

0006675 

3 14 

.4991653 

0028835 

3 69 

4998347 

.0006343 

3 15 

,4991836 

0027943 

3 60 

4998409 

0006119 


¹ From Kent, “The Elements of Statistics.”
Table 1. — Area and Ordinate of the Normal Curve.¹ — (Continued)


x/σ

Area

Ordinate (y)

x/σ

Area

Ordinate (y)

3 61 

4998469 

0006902 

4 06 

4999756 

0001051 

3 62 

4998627 

.0005693 

4 07 

.4999765 

0001009 

3 63 

4998583 

0005490 

4 03 

.4999775 

0000969 

3 64 

4998637 

0005294 

4 09 

4999784 

0000930 

3 66 

4998689 

0005105 

4 10 

4999793 

0000893 

3 66 

4998739 

0004921 

4 11 

.4999802 

0000857 

3 67 

4998787 

0004744 

4 12 

.4999811 

0000822 

3 68 

4998834 

.0004673 

4 13 

.4999819 

0000789 

3 69 

4998879 

.0004408 

4 14 

4999826 

0000767 

3 70 

4998922 

0004248 

4 16 

4999834 

0000726 

3 71 

4998964 

0004093 

4 16 

4999841 

0000697 

3 72 

.4999004 

0003800 

4 17 

4999848 

0000668 

3 73 

4999043 

.0003661 

4 18 

,4999854 

0000641 

3 74 

.4999080 

0003526 

4 19 

4999861 

0000616 

3 76 

4999116 

0003886 

4 20 

4999867 

0000589 

3 76 

4999150 

0003396 

4 21 

4999872 

0000565 

3 77 

4999184 

0003271 

4 22 

4999878 

0000542 

3 78 

4999216 

.0003149 

4 23 

4999883 

0000519 

3 79 

4999247 

0003032 

4 24 

4999888 

0000498 

3 80 

4999277 

0002919 

4 25 

.4999893 

0000477 

3 81 

4999305 

.0002810 

4 26 

4999898 

0000457 

3 82 

4999333 

0002705 

4 27 

4999902 

0000438 

3 83 

4999359 

0002604 

4 28 ^ 

4999907 

0000420 

3 84 

4999386 

0002506 

4 29 

4999911 

0000402 

3 85 

4999409 

0002411 

4 30 

4999915 

0000386 

3 86 

4999433 

0002320 

4 31 

4999918 

0000369 

3 87 

4999466 

0002232 

4 32 

4999922 

0000364 

3 88 

4999478 

0002147 

4 33 

4999925 

0000339 

3 89 

4999499 

0002065 

4 34 

4999929 

0000324 

3 90 

4999619 

0001987 

4 36 

.4999932 

0000310 

3 91 

4999639 

0001910 

4 36 

4999935 

0000297 

3 92 

4999557 

0001837 

4 37 

4999938 

0000284 

3 93 

4999576 

0001766 

4 38 

4999941 

0000272 

3 94 

4999593 

.0001698 

4 39 

4999943 

0000261 

3 96 

4999609 

0001633 

4 40 

4999946 

0000249 

3 96 

4999626 

0001569 

4 41 

.4999948 

0000239 

3 97 

4999641 

0001608 

4 42 

4999951 

0000228 

3 98 

4999656 

0001449 

4 43 

4999953 

0000218 

3 99 

4999670 

0001393 

4 44 

4999955 

0000209 

4 00 

4999683 

0001338 

4 45 

4999967 

0000200 

4 01 

4999696 

0001286 

4 46 

4999969 

0000191 

4 02 

4999709 

0001236 

4 47 

4999961 

0000183 

4 03 

4999721 

0001186 

4 48 

.4999963 

0000175 

4 04 

4999733 

0001140 

4 49 

4999964 

0000167 

4 05 

4999744 

0001094 

4 50 

4999966 

0000160 


¹ From Kent, “The Elements of Statistics.”
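
The entries of Table 1 can be checked by direct computation. The following is a minimal sketch in Python, not part of Kent's table. It assumes that "Area" means the area under the unit normal curve between the mean and x/σ, and that "Ordinate (y)" means the height of that curve at x/σ. The function names are hypothetical.

```python
import math


def normal_area(z):
    """Area under the unit normal curve between the mean and z = x/sigma."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


def normal_ordinate(z):
    """Height (y) of the unit normal curve at z = x/sigma."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


# Print x/sigma, area, and ordinate for two rows of the table.
for z in (1.00, 2.00):
    print(f"{z:.2f}  {normal_area(z):.7f}  {normal_ordinate(z):.7f}")
```

For x/σ = 1.00 the sketch gives .3413447 and .2419707, and for x/σ = 2.00 it gives .4772499 and .0539910, in agreement with the tabulated rows.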
[Table of χ²: values of χ² for given values of n and P. The body of the table is not legible in this copy. The column heads P = .99, .98, .95, .90, .80, .70, and .50 can still be made out, together with these entries for n = 1 to 5: under P = .99: .000157, .0201, .115, .297, .554; under P = .98: .000628, .0404, .185, .429, .752.]


For larger values of n, the expression √(2χ²) − √(2n − 1) may be used as a normal deviate with unit standard deviation.

* This table is taken by consent from Statistical Methods for Research Workers by Prof. R. A. Fisher, by Oliver & Boyd, Edinburgh, and attention is
drawn to the larger collection in Statistical Tables by Prof. R. A. Fisher and F. Yates, by Oliver & Boyd, Edinburgh.
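
The note on larger values of n can be put into a few lines of code. This is only an illustrative sketch in Python. The function name is hypothetical, and n is taken to be the same n with which the χ² table is entered. The result is meant to be referred to a table of the normal curve.

```python
import math


def chi_square_normal_deviate(chi_sq, n):
    """Return sqrt(2 * chi-square) - sqrt(2n - 1), treated as a unit normal deviate."""
    if chi_sq < 0 or n < 1:
        raise ValueError("chi_sq must be non-negative and n at least 1")
    return math.sqrt(2.0 * chi_sq) - math.sqrt(2.0 * n - 1.0)


# Hypothetical figures: chi-square = 50 with n = 40.
deviate = chi_square_normal_deviate(50.0, 40)
print(f"{deviate:.4f}")
```

With these made-up figures the deviate is 10 − √79, or a little over 1.1 standard deviations.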
* From Davis and Nelson, Elements of Statistics, p. 31. By permission of the Cowles
Commission for Research in Economics, Chicago. For a larger table, see T. C. Fry,
Probability and Its Engineering Uses, pp. 439-462.
Table 4. — Values of the Correlation Coefficient for Different
Levels of Significance*

              P
  n      .05        .01

  1    .996917   .9998766
  2    .95000    .990000
  3    .8783     .95873
  4    .8114     .91720
  5    .7545     .8745
  6    .7067     .8343
  7    .6664     .7977
  8    .6319     .7646
  9    .6021     .7348
 10    .5760     .7079
 11    .5529     .6835
 12    .5324     .6614
 13    .5139     .6411
 14    .4973     .6226
 15    .4821     .6055
 16    .4683     .5897
 17    .4555     .5751
 18    .4438     .5614
 19    .4329     .5487
 20    .4227     .5368
 25    .3809     .4869
 30    .3494     .4487
 35    .3246     .4182
 40    .3044     .3932
 45    .2875     .3721
 50    .2732     .3541
 60    .2500     .3248
 70    .2319     .3017
 80    .2172     .2830
 90    .2050     .2673
100    .1946     .2540

For a total correlation, n is 2 less than the number of pairs in the sample; for a partial
correlation, the number of eliminated variates also should be subtracted.

* This table is taken by consent from Statistical Methods for Research Workers by Prof.
R. A. Fisher, by Oliver & Boyd, Edinburgh, and attention is drawn to the larger collection
in Statistical Tables by Prof. R. A. Fisher and F. Yates, by Oliver & Boyd, Edinburgh.
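
The rule for entering Table 4 can be sketched in code. This is only an illustration in Python under stated assumptions. The dictionary holds a handful of rows copied from the table, the function names are hypothetical, and no interpolation is attempted for values of n that are not tabulated.

```python
# A few rows of Table 4: n -> (value of r at P = .05, value of r at P = .01).
CRITICAL_R = {
    10: (0.5760, 0.7079),
    20: (0.4227, 0.5368),
    30: (0.3494, 0.4487),
    50: (0.2732, 0.3541),
    100: (0.1946, 0.2540),
}


def table_n(pairs, eliminated=0):
    """n for entering Table 4: pairs less 2, less any eliminated variates."""
    n = pairs - 2 - eliminated
    if n < 1:
        raise ValueError("too few pairs for the table")
    return n


def significance(r, pairs, eliminated=0):
    """Compare |r| with the tabulated values for the appropriate n."""
    n = table_n(pairs, eliminated)
    if n not in CRITICAL_R:
        raise KeyError(f"n = {n} is not among the rows copied into this sketch")
    r05, r01 = CRITICAL_R[n]
    if abs(r) >= r01:
        return "significant at the .01 level"
    if abs(r) >= r05:
        return "significant at the .05 level"
    return "not significant at the .05 level"


# Hypothetical example: r = .45 from 22 pairs, so n = 20.
print(significance(0.45, 22))
```

A total correlation from 22 pairs is entered at n = 20. A partial correlation from 34 pairs with two variates eliminated is entered at n = 30.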
Table 5. — Values of z for Given Values of r*


r 

000 

001 

002 

003 

004 

005 

006 

007 

008 

009 

.000 

0000 

0010 

0020 

0030 

0040 

0050 

0060 

0070 

0080 

0090 

.010 

0100 

.0110 

0120 

0130 

0140 

0150 

0160 i 

0170 

0180 

0190 

,020 

0200 

0210 

0220 

0230 

0240 

0250 

0260 

0270 

0280 

0290 

.030 

0300 

0310 

0320 

0330 

0340 

0350 

.0360 

0370 

0380 

0390 

,040 

0400 

.0410 

0420 

0430 

0440 

0450 

.0460 

0470 

0480 

0490 

-060 

0501 

0511 

0521 

0531 

0541 

0551 

0561 

0571 

0581 

0691 

.060 

0601 

0611 

0621 

0631 

0641 

0651 

0661 

0671 

0681 

0691 

.070 

.0701 

0711 

0721 

0731 

.0741 

0751 

0761 

0771 

0782 

0792 

.080 

.0802 

0812 

0822 

0832 

0842 

0852 

0862 

0872 

0882 

0892 

,090 

0902 

.0912 

.0922 

0933 

0943 

0953 

.0963 

.0973 

0983 

0993 

.100 

' .1003 

1013 

1024 

1034 

.1044 

.1054 

1064 

.1074 

1084 

1094 

.110 

1106 

1115 

1125 

1135 

.1145 

.1155 

1165 

1175 

1185 

1195 

.120 

.1206 

1216 

1226 

1236 

1246 

1257 

1267 

1277 

1287 

1297 

.130 

.1308 

1318 

.1328 

1338 

1348 

1358 

.1368 

1379 

1389 

1399 

,140 

1409 

.1419 

1430 

1440 

1450 

: 1460 

1470 

1481 

1491 

1501 

.150 

1611 

1522 

1532 

1542 

1552 

1563 

1573 

1583 

1593 

1604 

.160 

1614 

1624 

1634 

1654 

1655 

.1665 

1676 

1686 

1696 

1706 

.170 

1717 

1727 

.1737 

1748 

.1758 

1768 

.1779 

1789 

: 1799 

1810 

180 

.1820 

1830 

1841 

1851 

1861 

1872 

1882 

1892 

1903 

1913 

m190 

.1923 

1934 

.1944 

.1954 

.1965 

1975 

i 1986 

1996 

2007 

2017 

200 

2027 

.2038 

2048 

.2059 

2069 

2079 

2090 

2100 

2M1 

2121 

210 

2132 

2142 

2153 

2163 

2174 

2184 

2194 

2205 

2215 

2226 

220 

2237 

2247 

2258 

2268 

2279 

2289 

2300 

2310 

2321 

2331 

230 

2342 

.2353 

2363 

2374 

.2384 

2395 

2405 

2416 

2427 

2437 

240 

.2448 

2468 

.2469 

2480 

2490 

.2501 

1 

! 2511 

2522 

1 

2533 

2643 

260 

2554 

2565 

2575 

2586 

2597 

2608 

2618 

! 2629 

2640 

2650 

-260 

2661 

2672 

2682 

2693 

2704 

2715 

2726 

2736 

2747 

2758 

.370 

2769 

2779 

2790 

2801 

2812 

2823 

2833 

2844 

2855 

2866 

.280 

2877 

.2888 

2898 

2909 

2920 

2931 

2942 

2953 

2964 

2975 

.290 

.2986 

2997 

.3008 

3019 

3029 

i 3040 

.3051 

3062 

3073 

1 3084 

.300 

3095 

.3106 

3117 

3128 

3139 

3150 

3161 

3172 

1 3183 

I 3195 

.310 

3206 

3217 

3228 

3239 

3250 

3261 

3272 

3283 

: 3294 

3305 

-320 

3317 

3328 

3339 

3350 

3361 

3372 

.3384 

3395 

3406 

3417 

-330 

3428 

3439 

3451 

3462 

3473 

.3484 

3496 

3507 

3518 

3530 

.340 

3541 

.3562 

3664 

3575 

3586 

.3597 

3609 

3620 

3632 

.3643 

350 

3664 

3666 

.3677 

3689 

.3700 

.3712 

3723 

3734 

3746 

3757 

.360 

3769 

3780 

.3792 

3803 

.3815 

3826 

.3838 

3850 

3861 

3873 

.370 

3884 

.3896 

3907 

3919 

3931 

3942 

3954 

3966 

3977 

3989 

380 

4001 

4012 

.4024 

4036 

4047 

.4059 

4071 

i .4083 

4094 

4106 

.390 

4118 

4130 

.4142 

4153 

.4165 

4177 

.4189 

.4201 

4213 

4225 

.400 

4236 

.4248 

4260 

4272 

4284 

.4296 

.4308 

4320 

4332 

4344 

.410 

4356 

4368 

.4380 

4392 

.4404 

.4416 

,4429 

4441 

4453 

4465 

.420 

4477 

4489 

4501 

4513 

4526 

4538 

.4550 

4562 

4574 

4587 

43'^0 

.4699 

4611 

4623 

4636 

4648 

.4660 

4673 

4685 

4697 

4710 

440 

.4722 

4736 

4747 

4760 

4772 

.4784 

,4797 

4809 

4822 

4835 

450 

4847 

.4860 

4872 

4885 

4897 

4910 

4923 

4935 

4948 

4961 

.460 

4973 

.4986 

4999 

5011 

5024 

5037 

5049 

5062 

5075 

5088 

.470 

.6101 

.6114 

5126 

5139 

5152 

.5165 

.5178 

.5191 

5204 

5217 

.480 

6230 

5243 

.6266 

5279 

5282 

5295 

5308 

5321 

5334 

6347 

.490 

.6361 

6374 

6387 

5400 

5413 

5427 

5440 

.5453 

5466 

5480 


* From Albert E. Waugh, Laboratory Manual and Problems for Elements of Statistical
Method, pp. 32-33, McGraw-Hill Book Company, Inc., New York.
Table 5. — Values of z for Given Values of r.* — (Continued)


r 

000 

001 

.002 

003 

004 

005 

006 

007 

008 

009 

.600 

5493 

5506 

5520 

6533 

6547 

5560 


6573 

6687 


6600 

5614 

.610 

5627 

6641 

6654 

6668 

6681 

5695 


6709 

6722 


6736 

5750 

620 

5763 

6777 

.6791 

6805 

.6818 

6832 


5846 

5860 


6874 

5888 

630 

5901 

6915 

.5929 

5943 

.6957 

.6971 


5985 

6999 


6013 

6027 

540 

6042 

.6056 

.6070 

6084 

.6098 

6112 


.6127 

6141 


6155 

6170 

,650 

6184 

.6198 

.6213 

.6227 

.6241 

6256 


6270 

6285 


6299 

6314 

.660 

6328 

6343 

.6368 

.6372 

6387 

6401 


6416 

6431 


6446 

6460 

,670 

6475 

6490 

6505 

.6620 

.6535 

6550 


.6565 

6679 


6594 

6610 

.680 

6625 

.6640 

.6655 

,6670 

6686 

6700 


.6715 ' 

6731 


6746 

6761 

.690 

6777 1 

.6792 

.6807 ^ 

.6823 

.6838 

.6864 


.6869 

6885 


6900 

6916 

.600 

6931 

.6947 

.6963 

.6978 

.6994 ! 

.7010 


.7026 

.7042 


7057 

7073 

.610 

7089 

.7105 ; 

.7121 

.7137 

.7163 1 

.7169 


.7185 

.7201 1 


7218 

7234 

,620 

7250 

.7266 

.7283 

.7299 

.7315 

.7332 1 


.7348 

7364 


7381 

7398 

,630 

7414 

7431 

7447 

.7464 

.7481 

7497 


.7614 

7531 


7548 

7565 

.640 

7682 

.7599 

7616 

.7633 

.7650 

.7667 


.7684 

.7701 


7718 

7736 

-660 

7753 

7770 

7788 

.7805 

.7823 

.7840 


7868 

7875 


.7893 

7910 

.660 

.7928 

7946 

7964 

.7981 

.7999 

.8017 


8035 

8053 


8071 

8089 

.670 

8107 

.8126 

8144 

8162 

.8180 

8199 


.8217 

8236 


8254 

8273 

,680 

8291 

.8310 

8328 

.8347 

.8366 

8385 


.8404 

8423 


.8442 

8461 

.690 

8480 

.8499 

8518 

8637 

.8566 

8576 


.8595 

8614 


8634 

8653 

.700 

8673 

.8693 

.8712 

.8732 

.8752 

8772 


.8792 

8812 


8832 

8852 

.710 

8872 

.8892 

8912 

8933 

.8953 

8973 


.8994 

! 9014 


9035 

9056 

,720 

9076 

.9097 

9118 

.9139 

.9160 

9181 


.9202 

1 .9223 


9245 

9266 

.730 

9287 

9309 

9330 

9352 

.9373 

.9395 


9417 

' 9439 


9461 

9483 

.740 

9505 

.9527 

.9549 

.9671 

.9694 

9616 


9639 

9661 


9684 

9707 

.760 

9730 

9762 

9776 

9799 

9822 

9846 


9868 

i 9892 


9916 

9939 

.760 

9962 

9986 

1 0010 

1 0034 

1.0058 

1 0082 

1 

0106 

1 0130 

1 

0154 

1 0179 

.770 

1 0203 

1 0228 

1 0253 

1 0277 

1 0302 

1 0327 

1 

0362 

1 0378 

1 

0403 

1 0428 

.780 

1 0454 

1 0479 

1 0505 

1 0531 

1 0657 

1 0583 

1 

0609 

1 0635 

1 

0661 

1 0688 

.790 

1 0714 

1 0741 

1 0768 

1 0795 

1 0822 

1 0849 

1 

0876 

1 0903 

1 

0931 

1 0958 

.800 

1 0986 

1 1014 

1 1041 

1 1070 

1 1098 

1 1127 

1 

1165 

1 1184 

1 

1212 

1 1241 

.810 

1 1270 

1 1299 

1 1329 

1 1358 

1 1388 

1 1417 

1 

1447 

1 1477 

1 

1607 

1 1538 

.820 

1 1668 

1 1599 

1 1630 

1 1660 

1 1692 

1 1723 

1 

1754 

1 1786 

1 

1817 

1 1849 

.830 

ll 1870 

1 1913 

1 1946 

1 1979 

1 2011 

1 2044 

1 

2077 

1 2111 

1 

2144 

1 2178 

.840 

1 2212 

1 2246 

1 2280 

1 2316 

1 2349 

1 2384 

1 

2419 

1 2454 

1 

2490 

1 2526 

.850 

1 2661 

1 2598 

1 2634 

1 2670 

1 2708 

1 2744 

1 

2782 

1 2819 

1 

2857 

1 2895 

.860 

1 2934 

1 2972 

1 3011 

1 3050 

1 3089 

1 3129 

1 

3168 

1 3209 

1 

3249 

1 3290 

.870 

1 3331 

1 3372 

1 3414 

1 3456 

1 3498 

1 3540 

1 

35S3 

1 3626 

1 

3670 

1 3714 

.880 

1 3768 

1 3802 

1 3847 

1 3892 

1 3938 

1 3984 

1 

4030 

1 4077 

1 

4124 

1 4171 

,890 

1 4219 

1 4268 

1 4316 

1 4366 

1 4416 

1 4465 

1 4516 

1 4566 

1 

4618 

1 4670 

.900 

1 4722 

1 4775 

1 4828 

1 4883 

1 4937 

1 4992 

1 

6047 

1 6103 

1 

6160 

1 5217 

.910 

1 6275 

1 6334 

1 5393 

1 5453 

1 6513 

1 5674 

1 

6636 

1 6698 

1 

6762 

1 5825 

.920 

1 5890 

1 6956 

1 6022 

1 6089 

1 6157 

1 6226 

1 

6296 

1 6366 

1 

6438 

1 6510 

-930 

1 6684 

1 6669 

1 6734 

1 6811 

1 6888 

1.6967 

1 

7047 

1 7129 

1 

7211 

1 7295 

.940 

1 7380 

1 7467 

1 7555 

1 7645 

1 7736 

1 7828 

1 

7923 

1 8019 

1 

8117 

1 8216 

.960 

1 8318 

1 8421 

1 8527 

1 8635 

1.8745 

1 8857 

1 

8972 

1 9090 

1 

9210 

1 9333 

,960 

1 9459 

1 9588 

1 9721 

1 9857 

1 9996 

2 0140 

2 0287 

2 0439 

2 

0595 

2 0756 

.970 

2 0923 

2 1095 

2 1273 

2 1457 

2 1649 

2 1847 

2 

2054 

2 2269 

2 

2494 

2 2729 

.980 

2 2976 

2 3223 

2,3507 

2 3796 

2 4101 

2 4426 

2 4774 

2 6147 

2 

5550 

2 5988 

.990 

2 6467 

2 6996 

2 7687 

2 8267 

2 9031 

2 9945 

3 

1063 1 

3 2504 

3 

4534 

3 8002 


r          z

.9999     4.95172
.99999    6.10303
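
Values of z lying between or beyond the tabulated entries can be computed directly. The sketch below, in Python, assumes that the z of Table 5 is Fisher's transformation z = ½ logₑ[(1 + r)/(1 − r)]. The page itself does not state the formula, but the tabulated entries agree with it. The function names are hypothetical.

```python
import math


def fisher_z(r):
    """z = 0.5 * ln((1 + r) / (1 - r)), defined for -1 < r < 1."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def r_from_z(z):
    """Inverse transformation: the r that corresponds to a given z."""
    return math.tanh(z)


# Print two tabulated values for comparison with Table 5.
for r in (0.5, 0.9999):
    print(f"{r}  {fisher_z(r):.5f}")
```

For r = .500 the sketch gives .5493, and for r = .9999 it gives 4.95172, as in the table.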
Foreword to Table 6. — To extract the square root of any number, we
begin at the decimal point and group the figures by pairs in both directions.
For example, 7,500,000,000,000 becomes 07 50 00 00 00 00 00. In
Table 6 we look up the figure 750. Its square root is seen to be 27.3861.
We allow one figure in the root for each pair of figures in the number. There
are seven pairs to the left of the decimal in our number and none to the right
of the decimal, so the root will contain seven figures to the left of the decimal,
thus: 2,738,610. In looking up a square root, never separate the figures in
a pair. In our illustration it would be wrong to find the square root of the
number 75 or of the number 7500.

When the square root of a large number (e.g., 7,583,615,000,000) cannot
be found exactly from the table, the nearest approximation is often taken
(e.g., take the square root of 7,580,000,000,000 as roughly equivalent to the
square root of 7,583,615,000,000). Where greater accuracy is required, a
larger table may be used (see Barlow's Tables of Squares, Cubes, Square
Roots, Cube Roots, Reciprocals of All Integer Numbers Up to 10,000, Spon and
Chamberlain, 120 Liberty Street, New York), or any elementary textbook in
algebra may be consulted for the method of extracting a square root. Calculating
machine companies furnish pamphlets describing how to extract
a square root on their machines. A slide rule gives approximate square
roots easily and rapidly.
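
The pairing rule of the foreword can be sketched in code. This is only an illustration in Python for positive whole numbers. The function names are hypothetical, and math.sqrt rounded to four places stands in for looking up the root in Table 6.

```python
import math


def pair_groups(n):
    """Group the figures of a positive whole number by pairs, starting at the decimal point."""
    digits = str(n)
    if len(digits) % 2:           # pad so that the leftmost group is also a pair
        digits = "0" + digits
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]


def table_sqrt(n):
    """Approximate the square root of n from the root of its two leading pairs."""
    pairs = pair_groups(n)
    lead = int("".join(pairs[:2]))        # e.g. 07 50 gives 750, never 75 or 7500
    root = round(math.sqrt(lead), 4)      # stand-in for the four-place tabulated root
    extra_pairs = max(len(pairs) - 2, 0)  # each further pair adds one figure to the root
    return root * 10 ** extra_pairs


print(pair_groups(7_500_000_000_000))
print(round(table_sqrt(7_500_000_000_000)))
```

For 7,500,000,000,000 the sketch forms the seven pairs 07 50 00 00 00 00 00 and returns 2,738,610. For 7,583,615,000,000 it uses the root of 758, which corresponds to the rough approximation described in the foreword.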
Table 6. — Squares and Square Roots*

Number   Square   Square root

   1         1      1.0000
   2         4      1.4142
   3         9      1.7321
   4        16      2.0000
   5        25      2.2361
   6        36      2.4495
   7        49      2.6458
   8        64      2.8284
   9        81      3.0000
  10       100      3.1623
  11       121      3.3166
  12       144      3.4641
  13       169      3.6056
  14       196      3.7417
  15       225      3.8730
  16       256      4.0000
  17       289      4.1231
  18       324      4.2426
  19       361      4.3589
  20       400      4.4721
  21       441      4.5826
  22       484      4.6904
  23       529      4.7958
  24       576      4.8990
  25       625      5.0000
  26       676      5.0990
  27       729      5.1962
  28       784      5.2915
  29       841      5.3852
  30       900      5.4772
  31       961      5.5678
  32      1024      5.6569
  33      1089      5.7446
  34      1156      5.8310
  35      1225      5.9161
  36      1296      6.0000
  37      1369      6.0828
  38      1444      6.1644
  39      1521      6.2450
  40      1600      6.3246
  41      1681      6.4031
  42      1764      6.4807
  43      1849      6.5574
  44      1936      6.6332
  45      2025      6.7082
  46      2116      6.7823
  47      2209      6.8557
  48      2304      6.9282
  49      2401      7.0000
  50      2500      7.0711
  51      2601      7.1414
  52      2704      7.2111
  53      2809      7.2801
  54      2916      7.3485
  55      3025      7.4162
  56      3136      7.4833
  57      3249      7.5498
  58      3364      7.6158
  59      3481      7.6811
  60      3600      7.7460
  61      3721      7.8102
  62      3844      7.8740
  63      3969      7.9373
  64      4096      8.0000
  65      4225      8.0623
  66      4356      8.1240
  67      4489      8.1854
  68      4624      8.2462
  69      4761      8.3066
  70      4900      8.3666
  71      5041      8.4261
  72      5184      8.4853
  73      5329      8.5440
  74      5476      8.6023
  75      5625      8.6603
  76      5776      8.7178
  77      5929      8.7750
  78      6084      8.8318
  79      6241      8.8882
  80      6400      8.9443

* From Herbert Sorenson, Statistics for Students of Psychology and Education, pp. 347-359,
McGraw-Hill Book Company, Inc., New York.
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root

  81      6561      9.0000
  82      6724      9.0554
  83      6889      9.1104
  84      7056      9.1652
  85      7225      9.2195
  86      7396      9.2736
  87      7569      9.3274
  88      7744      9.3808
  89      7921      9.4340
  90      8100      9.4868
  91      8281      9.5394
  92      8464      9.5917
  93      8649      9.6437
  94      8836      9.6954
  95      9025      9.7468
  96      9216      9.7980
  97      9409      9.8489
  98      9604      9.8995
  99      9801      9.9499
 100     10000     10.0000
 101     10201     10.0499
 102     10404     10.0995
 103     10609     10.1489
 104     10816     10.1980
 105     11025     10.2470
 106     11236     10.2956
 107     11449     10.3441
 108     11664     10.3923
 109     11881     10.4403
 110     12100     10.4881
 111     12321     10.5357
 112     12544     10.5830
 113     12769     10.6301
 114     12996     10.6771
 115     13225     10.7238
 116     13456     10.7703
 117     13689     10.8167
 118     13924     10.8628
 119     14161     10.9087
 120     14400     10.9545
 121     14641     11.0000
 122     14884     11.0454
 123     15129     11.0905
 124     15376     11.1355
 125     15625     11.1803
 126     15876     11.2250
 127     16129     11.2694
 128     16384     11.3137
 129     16641     11.3578
 130     16900     11.4018
 131     17161     11.4455
 132     17424     11.4891
 133     17689     11.5326
 134     17956     11.5758
 135     18225     11.6190
 136     18496     11.6619
 137     18769     11.7047
 138     19044     11.7473
 139     19321     11.7898
 140     19600     11.8322
 141     19881     11.8743
 142     20164     11.9164
 143     20449     11.9583
 144     20736     12.0000
 145     21025     12.0416
 146     21316     12.0830
 147     21609     12.1244
 148     21904     12.1655
 149     22201     12.2066
 150     22500     12.2474
 151     22801     12.2882
 152     23104     12.3288
 153     23409     12.3693
 154     23716     12.4097
 155     24025     12.4499
 156     24336     12.4900
 157     24649     12.5300
 158     24964     12.5698
 159     25281     12.6095
 160     25600     12.6491
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root

 161     25921     12.6886
 162     26244     12.7279
 163     26569     12.7671
 164     26896     12.8062
 165     27225     12.8452
 166     27556     12.8841
 167     27889     12.9228
 168     28224     12.9615
 169     28561     13.0000
 170     28900     13.0384
 171     29241     13.0767
 172     29584     13.1149
 173     29929     13.1529
 174     30276     13.1909
 175     30625     13.2288
 176     30976     13.2665
 177     31329     13.3041
 178     31684     13.3417
 179     32041     13.3791
 180     32400     13.4164
 181     32761     13.4536
 182     33124     13.4907
 183     33489     13.5277
 184     33856     13.5647
 185     34225     13.6015
 186     34596     13.6382
 187     34969     13.6748
 188     35344     13.7113
 189     35721     13.7477
 190     36100     13.7840
 191     36481     13.8203
 192     36864     13.8564
 193     37249     13.8924
 194     37636     13.9284
 195     38025     13.9642
 196     38416     14.0000
 197     38809     14.0357
 198     39204     14.0712
 199     39601     14.1067
 200     40000     14.1421
 201     40401     14.1774
 202     40804     14.2127
 203     41209     14.2478
 204     41616     14.2829
 205     42025     14.3178
 206     42436     14.3527
 207     42849     14.3875
 208     43264     14.4222
 209     43681     14.4568
 210     44100     14.4914
 211     44521     14.5258
 212     44944     14.5602
 213     45369     14.5945
 214     45796     14.6287
 215     46225     14.6629
 216     46656     14.6969
 217     47089     14.7309
 218     47524     14.7648
 219     47961     14.7986
 220     48400     14.8324
 221     48841     14.8661
 222     49284     14.8997
 223     49729     14.9332
 224     50176     14.9666
 225     50625     15.0000
 226     51076     15.0333
 227     51529     15.0665
 228     51984     15.0997
 229     52441     15.1327
 230     52900     15.1658
 231     53361     15.1987
 232     53824     15.2315
 233     54289     15.2643
 234     54756     15.2971
 235     55225     15.3297
 236     55696     15.3623
 237     56169     15.3948
 238     56644     15.4272
 239     57121     15.4596
 240     57600     15.4919
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root

 321    103041     17.9165
 322    103684     17.9444
 323    104329     17.9722
 324    104976     18.0000
 325    105625     18.0278
 326    106276     18.0555
 327    106929     18.0831
 328    107584     18.1108
 329    108241     18.1384
 330    108900     18.1659
 331    109561     18.1934
 332    110224     18.2209
 333    110889     18.2483
 334    111556     18.2757
 335    112225     18.3030
 336    112896     18.3303
 337    113569     18.3576
 338    114244     18.3848
 339    114921     18.4120
 340    115600     18.4391
 341    116281     18.4662
 342    116964     18.4932
 343    117649     18.5203
 344    118336     18.5472
 345    119025     18.5742
 346    119716     18.6011
 347    120409     18.6279
 348    121104     18.6548
 349    121801     18.6815
 350    122500     18.7083
 351    123201     18.7350
 352    123904     18.7617
 353    124609     18.7883
 354    125316     18.8149
 355    126025     18.8414
 356    126736     18.8680
 357    127449     18.8944
 358    128164     18.9209
 359    128881     18.9473
 360    129600     18.9737
 361    130321     19.0000
 362    131044     19.0263
 363    131769     19.0526
 364    132496     19.0788
 365    133225     19.1050
 366    133956     19.1311
 367    134689     19.1572
 368    135424     19.1833
 369    136161     19.2094
 370    136900     19.2354
 371    137641     19.2614
 372    138384     19.2873
 373    139129     19.3132
 374    139876     19.3391
 375    140625     19.3649
 376    141376     19.3907
 377    142129     19.4165
 378    142884     19.4422
 379    143641     19.4679
 380    144400     19.4936
 381    145161     19.5192
 382    145924     19.5448
 383    146689     19.5704
 384    147456     19.5959
 385    148225     19.6214
 386    148996     19.6469
 387    149769     19.6723
 388    150544     19.6977
 389    151321     19.7231
 390    152100     19.7484
 391    152881     19.7737
 392    153664     19.7990
 393    154449     19.8242
 394    155236     19.8494
 395    156025     19.8746
 396    156816     19.8997
 397    157609     19.9249
 398    158404     19.9499
 399    159201     19.9750
 400    160000     20.0000
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root    Number   Square   Square root
401      160801   20.0250        441      194481   21.0000
402      161604   20.0499        442      195364   21.0238
403      162409   20.0749        443      196249   21.0476
404      163216   20.0998        444      197136   21.0713
405      164025   20.1246        445      198025   21.0950
406      164836   20.1494        446      198916   21.1187
407      165649   20.1742        447      199809   21.1424
408      166464   20.1990        448      200704   21.1660
409      167281   20.2237        449      201601   21.1896
410      168100   20.2485        450      202500   21.2132
411      168921   20.2731        451      203401   21.2368
412      169744   20.2978        452      204304   21.2603
413      170569   20.3224        453      205209   21.2838
414      171396   20.3470        454      206116   21.3073
415      172225   20.3715        455      207025   21.3307
416      173056   20.3961        456      207936   21.3542
417      173889   20.4206        457      208849   21.3776
418      174724   20.4450        458      209764   21.4009
419      175561   20.4695        459      210681   21.4243
420      176400   20.4939        460      211600   21.4476
421      177241   20.5183        461      212521   21.4709
422      178084   20.5426        462      213444   21.4942
423      178929   20.5670        463      214369   21.5174
424      179776   20.5913        464      215296   21.5407
425      180625   20.6155        465      216225   21.5639
426      181476   20.6398        466      217156   21.5870
427      182329   20.6640        467      218089   21.6102
428      183184   20.6882        468      219024   21.6333
429      184041   20.7123        469      219961   21.6564
430      184900   20.7364        470      220900   21.6795
431      185761   20.7605        471      221841   21.7025
432      186624   20.7846        472      222784   21.7256
433      187489   20.8087        473      223729   21.7486
434      188356   20.8327        474      224676   21.7715
435      189225   20.8567        475      225625   21.7945
436      190096   20.8806        476      226576   21.8174
437      190969   20.9045        477      227529   21.8403
438      191844   20.9284        478      228484   21.8632
439      192721   20.9523        479      229441   21.8861
440      193600   20.9762        480      230400   21.9089
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root    Number   Square   Square root
481      231361   21.9317        521      271441   22.8254
482      232324   21.9545        522      272484   22.8473
483      233289   21.9773        523      273529   22.8692
484      234256   22.0000        524      274576   22.8910
485      235225   22.0227        525      275625   22.9129
486      236196   22.0454        526      276676   22.9347
487      237169   22.0681        527      277729   22.9565
488      238144   22.0907        528      278784   22.9783
489      239121   22.1133        529      279841   23.0000
490      240100   22.1359        530      280900   23.0217
491      241081   22.1585        531      281961   23.0434
492      242064   22.1811        532      283024   23.0651
493      243049   22.2036        533      284089   23.0868
494      244036   22.2261        534      285156   23.1084
495      245025   22.2486        535      286225   23.1301
496      246016   22.2711        536      287296   23.1517
497      247009   22.2935        537      288369   23.1733
498      248004   22.3159        538      289444   23.1948
499      249001   22.3383        539      290521   23.2164
500      250000   22.3607        540      291600   23.2379
501      251001   22.3830        541      292681   23.2594
502      252004   22.4054        542      293764   23.2809
503      253009   22.4277        543      294849   23.3024
504      254016   22.4499        544      295936   23.3238
505      255025   22.4722        545      297025   23.3452
506      256036   22.4944        546      298116   23.3666
507      257049   22.5167        547      299209   23.3880
508      258064   22.5389        548      300304   23.4094
509      259081   22.5610        549      301401   23.4307
510      260100   22.5832        550      302500   23.4521
511      261121   22.6053        551      303601   23.4734
512      262144   22.6274        552      304704   23.4947
513      263169   22.6495        553      305809   23.5160
514      264196   22.6716        554      306916   23.5372
515      265225   22.6936        555      308025   23.5584
516      266256   22.7156        556      309136   23.5797
517      267289   22.7376        557      310249   23.6008
518      268324   22.7596        558      311364   23.6220
519      269361   22.7816        559      312481   23.6432
520      270400   22.8035        560      313600   23.6643
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root    Number   Square   Square root
561      314721   23.6854        601      361201   24.5153
562      315844   23.7065        602      362404   24.5357
563      316969   23.7276        603      363609   24.5561
564      318096   23.7487        604      364816   24.5764
565      319225   23.7697        605      366025   24.5967
566      320356   23.7908        606      367236   24.6171
567      321489   23.8118        607      368449   24.6374
568      322624   23.8328        608      369664   24.6577
569      323761   23.8537        609      370881   24.6779
570      324900   23.8747        610      372100   24.6982
571      326041   23.8956        611      373321   24.7184
572      327184   23.9165        612      374544   24.7386
573      328329   23.9374        613      375769   24.7588
574      329476   23.9583        614      376996   24.7790
575      330625   23.9792        615      378225   24.7992
576      331776   24.0000        616      379456   24.8193
577      332929   24.0208        617      380689   24.8395
578      334084   24.0416        618      381924   24.8596
579      335241   24.0624        619      383161   24.8797
580      336400   24.0832        620      384400   24.8998
581      337561   24.1039        621      385641   24.9199
582      338724   24.1247        622      386884   24.9399
583      339889   24.1454        623      388129   24.9600
584      341056   24.1661        624      389376   24.9800
585      342225   24.1868        625      390625   25.0000
586      343396   24.2074        626      391876   25.0200
587      344569   24.2281        627      393129   25.0400
588      345744   24.2487        628      394384   25.0599
589      346921   24.2693        629      395641   25.0799
590      348100   24.2899        630      396900   25.0998
591      349281   24.3105        631      398161   25.1197
592      350464   24.3311        632      399424   25.1396
593      351649   24.3516        633      400689   25.1595
594      352836   24.3721        634      401956   25.1794
595      354025   24.3926        635      403225   25.1992
596      355216   24.4131        636      404496   25.2190
597      356409   24.4336        637      405769   25.2389
598      357604   24.4540        638      407044   25.2587
599      358801   24.4745        639      408321   25.2784
600      360000   24.4949        640      409600   25.2982
Table 6. — Squares and Square Roots. — (Continued)


Number Square Square root Number Square Square root 


25.3180
25.3377
25.3574
25.3772
25.3969
25.4165
25.4362
25.4558
25.4755
25.4951



463761 

465124 

466489 

467856 

469225 

470596 

471969 

473344 

474721 

476100 

477481 

478864 

480249 

481636 

483025 

484416 

485809 

487204 

488601 

490000 

491401 

492804 

494209 

495616 

497025 

498436 

499849 

501264 

502681 

504100 


26.0960
26.1151
26.1343
26.1534
26.1725
26.1916
26.2107
26.2298
26.2488
26.2679

26.2869
26.3059
26.3249
26.3439
26.3629
26.3818
26.4008
26.4197
26.4386
26.4575

26.4764
26.4953
26.5141
26.5330
26.5518
26.5707
26.5895
26.6083
26.6271
26.6458

26.6646
26.6833
26.7021
26.7208
26.7395
26.7582
26.7769
26.7955
26.8142
26.8328
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root    Number   Square   Square root
721      519841   26.8514        761      579121   27.5862
722      521284   26.8701        762      580644   27.6043
723      522729   26.8887        763      582169   27.6225
724      524176   26.9072        764      583696   27.6405
725      525625   26.9258        765      585225   27.6586
726      527076   26.9444        766      586756   27.6767
727      528529   26.9629        767      588289   27.6948
728      529984   26.9815        768      589824   27.7128
729      531441   27.0000        769      591361   27.7308
730      532900   27.0185        770      592900   27.7489
731      534361   27.0370        771      594441   27.7669
732      535824   27.0555        772      595984   27.7849
733      537289   27.0740        773      597529   27.8029
734      538756   27.0924        774      599076   27.8209
735      540225   27.1109        775      600625   27.8388
736      541696   27.1293        776      602176   27.8568
737      543169   27.1477        777      603729   27.8747
738      544644   27.1662        778      605284   27.8927
739      546121   27.1846        779      606841   27.9106
740      547600   27.2029        780      608400   27.9285
741      549081   27.2213        781      609961   27.9464
742      550564   27.2397        782      611524   27.9643
743      552049   27.2580        783      613089   27.9821
744      553536   27.2764        784      614656   28.0000
745      555025   27.2947        785      616225   28.0179
746      556516   27.3130        786      617796   28.0357
747      558009   27.3313        787      619369   28.0535
748      559504   27.3496        788      620944   28.0713
749      561001   27.3679        789      622521   28.0891
750      562500   27.3861        790      624100   28.1069
751      564001   27.4044        791      625681   28.1247
752      565504   27.4226        792      627264   28.1425
753      567009   27.4408        793      628849   28.1603
754      568516   27.4591        794      630436   28.1780
755      570025   27.4773        795      632025   28.1957
756      571536   27.4955        796      633616   28.2135
757      573049   27.5136        797      635209   28.2312
758      574564   27.5318        798      636804   28.2489
759      576081   27.5500        799      638401   28.2666
760      577600   27.5681        800      640000   28.2843



ELEMENTARY SOCIAL STATISTICS 


Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root    Number   Square   Square root
801      641601   28.3019        841      707281   29.0000
802      643204   28.3196        842      708964   29.0172
803      644809   28.3373        843      710649   29.0345
804      646416   28.3549        844      712336   29.0517
805      648025   28.3725        845      714025   29.0689
806      649636   28.3901        846      715716   29.0861
807      651249   28.4077        847      717409   29.1033
808      652864   28.4253        848      719104   29.1204
809      654481   28.4429        849      720801   29.1376
810      656100   28.4605        850      722500   29.1548
811      657721   28.4781        851      724201   29.1719
812      659344   28.4956        852      725904   29.1890
813      660969   28.5132        853      727609   29.2062
814      662596   28.5307        854      729316   29.2233
815      664225   28.5482        855      731025   29.2404
816      665856   28.5657        856      732736   29.2575
817      667489   28.5832        857      734449   29.2746
818      669124   28.6007        858      736164   29.2916
819      670761   28.6182        859      737881   29.3087
820      672400   28.6356        860      739600   29.3258
821      674041   28.6531        861      741321   29.3428
822      675684   28.6705        862      743044   29.3598
823      677329   28.6880        863      744769   29.3769
824      678976   28.7054        864      746496   29.3939
825      680625   28.7228        865      748225   29.4109
826      682276   28.7402        866      749956   29.4279
827      683929   28.7576        867      751689   29.4449
828      685584   28.7750        868      753424   29.4618
829      687241   28.7924        869      755161   29.4788
830      688900   28.8097        870      756900   29.4958
831      690561   28.8271        871      758641   29.5127
832      692224   28.8444        872      760384   29.5296
833      693889   28.8617        873      762129   29.5466
834      695556   28.8791        874      763876   29.5635
835      697225   28.8964        875      765625   29.5804
836      698896   28.9137        876      767376   29.5973
837      700569   28.9310        877      769129   29.6142
838      702244   28.9482        878      770884   29.6311
839      703921   28.9655        879      772641   29.6479
840      705600   28.9828        880      774400   29.6648
Table 6. — Squares and Square Roots. — (Continued)

Number   Square   Square root
881      776161   29.6816
882      777924   29.6985
883      779689   29.7153
884      781456   29.7321
885      783225   29.7489
886      784996   29.7658
887      786769   29.7825
888      788544   29.7993
889      790321   29.8161
890      792100   29.8329
891      793881   29.8496
892      795664   29.8664
893      797449   29.8831
894      799236   29.8998
895      801025   29.9166
896      802816   29.9333
897      804609   29.9500
898      806404   29.9666
899      808201   29.9833
900      810000   30.0000
901      811801   30.0167
902      813604   30.0333
903      815409   30.0500
904      817216   30.0666
905      819025   30.0832
906      820836   30.0998
907      822649   30.1164
908      824464   30.1330
909      826281   30.1496
910      828100   30.1662
911      829921   30.1828
912      831744   30.1993
913      833569   30.2159
914      835396   30.2324
915      837225   30.2490
916      839056   30.2655
917      840889   30.2820
918      842724   30.2985
919      844561   30.3150
920      846400   30.3315
Table 6. — Squares and Square Roots. — (Continued)

Number   Square    Square root    Number   Square    Square root
961      923521    31.0000        981      962361    31.3209
962      925444    31.0161        982      964324    31.3369
963      927369    31.0322        983      966289    31.3528
964      929296    31.0483        984      968256    31.3688
965      931225    31.0644        985      970225    31.3847
966      933156    31.0805        986      972196    31.4006
967      935089    31.0966        987      974169    31.4166
968      937024    31.1127        988      976144    31.4325
969      938961    31.1288        989      978121    31.4484
970      940900    31.1448        990      980100    31.4643
971      942841    31.1609        991      982081    31.4802
972      944784    31.1769        992      984064    31.4960
973      946729    31.1929        993      986049    31.5119
974      948676    31.2090        994      988036    31.5278
975      950625    31.2250        995      990025    31.5436
976      952576    31.2410        996      992016    31.5595
977      954529    31.2570        997      994009    31.5753
978      956484    31.2730        998      996004    31.5911
979      958441    31.2890        999      998001    31.6070
980      960400    31.3050        1000     1000000   31.6228
Foreword to Table 7. Logarithms are the greatest labor-saving discovery
ever made in the field of mathematics. With their aid, many calculations
can be performed easily and quickly that would not be feasible at all without
them.

The common logarithm of a number is the power to which 10 must be
raised to produce that number. For example, $10^2 = 100$, so the logarithm
of 100 is 2. Similarly, $90 = 10^{1.95424}$, and the logarithm of 90 is 1.95424.
In general, if $Y = 10^x$, then $\log Y = x$.

That part of the logarithm to the left of the decimal is called the
characteristic, while that part to the right of the decimal is the mantissa.
Thus, for log 90 = 1.95424, the characteristic is 1, the mantissa is .95424.

There are three fundamental principles that are constantly needed in
working with logarithms:

1. $\log (ab) = \log a + \log b$,

2. $\log (a/b) = \log a - \log b$,

3. $\log (a^n) = n \log a$.
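
The three principles can be checked numerically. The following is only an illustrative sketch in Python, not part of the original text; the values chosen for `a`, `b`, and `n` are arbitrary assumptions.

```python
import math

# Arbitrary illustrative values (assumptions, not taken from the text).
a, b, n = 5.0, 11.0, 3

# Principle 1: log(ab) = log a + log b
product_rule = math.isclose(math.log10(a * b), math.log10(a) + math.log10(b))

# Principle 2: log(a/b) = log a - log b
quotient_rule = math.isclose(math.log10(a / b), math.log10(a) - math.log10(b))

# Principle 3: log(a^n) = n log a
power_rule = math.isclose(math.log10(a ** n), n * math.log10(a))

print(product_rule, quotient_rule, power_rule)
```

Each comparison uses `math.isclose` rather than exact equality, since floating-point logarithms agree only to within rounding error.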

To find the mantissa of any number, we enter Table 7, find the first three
digits of the number in the left-hand column, headed "No.," and find the
fourth digit in the top row, then read off the mantissa from the proper row
and column. Thus, for the number 1,503, we find 150 in the first column
and "3" in the fifth column of the table, and read off the mantissa .17696.
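
The row-and-column lookup just described can be sketched as a small dictionary lookup. This is a hypothetical illustration only: `TABLE_7_EXCERPT` and `lookup_mantissa` are names invented here, and the excerpt holds just the two entries for row 150 that the foreword itself quotes.

```python
# Hypothetical excerpt of Table 7: row label -> {fourth digit: mantissa digits}.
# Only the two entries quoted in the foreword are included.
TABLE_7_EXCERPT = {
    "150": {3: "17696", 4: "17725"},
}

def lookup_mantissa(four_digit_number):
    """Return the tabulated mantissa for a four-digit number."""
    digits = str(four_digit_number)
    row = digits[:3]          # first three digits select the row ("No." column)
    column = int(digits[3])   # fourth digit selects the column
    return float("0." + TABLE_7_EXCERPT[row][column])

print(lookup_mantissa(1503))
```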

The characteristic of a logarithm is discovered by placing the pencil point
between the first two significant figures (the first figure that is not zero is the
first significant figure) of the number, and counting the number of places it
must be moved to the right or left to reach the decimal point. If the pencil
is moved to the right, the characteristic is positive; if to the left, it is
negative. Thus, for the number 1,503, the pencil is placed between 1 and 5,
and moved three places to the right. The characteristic is therefore 3, and
the complete logarithm is 3.17696.

If the number is 15,030, the mantissa is the same, but the characteristic
is 4, so that the logarithm is 4.17696.
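
As a check on the pencil-point rule, the characteristic and mantissa can also be computed directly. The sketch below is an illustration under stated assumptions: the helper name `characteristic_and_mantissa` is invented here, and it relies on Python's `math.log10` rather than on Table 7.

```python
import math

def characteristic_and_mantissa(number):
    """Split the common logarithm of a positive number into its
    characteristic (integer part) and mantissa (fractional part)."""
    log_value = math.log10(number)
    characteristic = math.floor(log_value)
    mantissa = log_value - characteristic
    return characteristic, round(mantissa, 5)

print(characteristic_and_mantissa(1503))
print(characteristic_and_mantissa(15030))
```

Both numbers share the same mantissa and differ only in the characteristic, as the text states.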

In the case of the number 15,037, the exact mantissa cannot be read
directly from Table 7, but an approximate mantissa can be obtained by
taking the mantissa of the number 1,504, or, more accurately, by
interpolating. To interpolate, we subtract the mantissa of the number just
smaller than the given number from the mantissa of the number just larger,
then subtract from the given number the table number just smaller, place a
decimal point before the first figure of this last difference, multiply the first
difference by this value, and add the product to the mantissa of the table
number just smaller than the given number. Thus, the mantissa of 15,030
is .17696, the mantissa of 15,040 is .17725, and their difference is .00029; the
difference between the given number and the table number just smaller is
15,037 - 15,030 = 7, which becomes .7; the product of the first difference
and this value is .00029 X .7 = .000203; and this product added to the
mantissa of 15,030 is .17696 + .000203 = .17716. The logarithm of 15,037
is therefore 4.17716.
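
The interpolation steps above amount to ordinary linear interpolation between two tabulated mantissas. A minimal sketch follows, using only the two mantissas quoted in the text; the function name `interpolate_mantissa` is an assumption made for illustration.

```python
import math

def interpolate_mantissa(lower_mantissa, upper_mantissa, fraction):
    """Linear interpolation between two adjacent tabulated mantissas."""
    return lower_mantissa + (upper_mantissa - lower_mantissa) * fraction

mantissa_1503 = 0.17696   # tabulated mantissa for 1,503 (from the text)
mantissa_1504 = 0.17725   # tabulated mantissa for 1,504 (from the text)
fraction = 0.7            # 15,037 lies 7/10 of the way from 15,030 to 15,040

mantissa = interpolate_mantissa(mantissa_1503, mantissa_1504, fraction)
log_15037 = 4 + mantissa  # the characteristic of 15,037 is 4

print(round(mantissa, 5), round(log_15037, 5))
print(round(math.log10(15037), 5))  # direct computation, for comparison
```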

Suppose that the number whose logarithm is required is 15.037. The
mantissa is the same as that just found for the number 15,037, but the
characteristic changes from 4 to 1, so that the logarithm is now 1.17716.
Similarly, the logarithm of 1.5037 is 0.17716.

If we move the decimal one or more places to the left, giving, say, the
number .15037, we do not change the mantissa, but we encounter a negative
characteristic. For, if we put a pencil between 1 and 5, we must move it
one place to the left to reach the decimal point. The characteristic is then
-1. To avoid these awkward negative characteristics, it is customary to
write -1 in the form 9. . . . - 10, -2 in the form 8. . . . - 10, etc.

Hence, the logarithm of .15037 is written 9.17716 - 10.
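
The "9. . . . - 10" convention can be illustrated with a short sketch. The function name `table_style_log` is hypothetical, and the sketch assumes a positive number whose characteristic is no smaller than -10.

```python
import math

def table_style_log(number):
    """Format a common logarithm in the table convention, writing a
    negative characteristic c as (10 + c) followed by ' - 10'."""
    log_value = math.log10(number)
    characteristic = math.floor(log_value)
    mantissa = log_value - characteristic   # always between 0 and 1
    if characteristic >= 0:
        return f"{characteristic + mantissa:.5f}"
    return f"{10 + characteristic + mantissa:.5f} - 10"

print(table_style_log(0.15037))
print(table_style_log(1503))
```

The mantissa stays positive in this form, which is why the same table entry serves for .15037, 15.037, and 15,037.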

To obtain from Table 7 the number corresponding to a logarithm, we
find in the table the mantissa of the logarithm, write down the number
corresponding to it, and then point off this number in accordance with the
characteristic of the logarithm. For example, given the logarithm 2.27921,
we look in the table for the mantissa .27921, read off the corresponding
number 1,902, and point off this number by placing our pencil point between
the figures 1 and 9, then moving it two places to the right as indicated by the
positive characteristic, 2, getting 190.2 as the result. If the logarithm is
0.27921, the number is 1.902; if the logarithm is 8.27921 - 10, the number
is 0.01902, and so on.
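
Finding the number that corresponds to a logarithm is the same as raising 10 to that power. A brief sketch, with the helper name `antilog` assumed for illustration, reproduces the three cases in the text:

```python
def antilog(logarithm):
    """Return the number whose common logarithm is the given value."""
    return 10 ** logarithm

hundreds = antilog(2.27921)         # characteristic 2
units = antilog(0.27921)            # characteristic 0
hundredths = antilog(8.27921 - 10)  # characteristic -2, written 8. ... - 10

print(round(hundreds, 1), round(units, 3), round(hundredths, 5))
```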

Let us now find a geometric mean by the use of logarithms. By formula
(14) of Chap. VII,

$G = (5 \cdot 11 \cdot 19)^{1/3}$.*

According to principle 3, above,

$\log G = \tfrac{1}{3} \log (5 \cdot 11 \cdot 19)$.

And by principle 1,

$\log G = \tfrac{1}{3} (\log 5 + \log 11 + \log 19)$.

Now, the numbers 5, 11, and 19 do not appear in Table 7, but the numbers
5,000, 1,100, and 1,900, which have the same mantissas, may be found there.
The mantissa of 5,000 is .69897, and the characteristic of 5 is 0, so log 5 is
0.69897. In the same way, we find log 11 = 1.04139, and log 19 = 1.27875.
We therefore have

$\log G = \tfrac{1}{3} (0.69897 + 1.04139 + 1.27875) = \tfrac{1}{3} (3.01911) = 1.00637$.

Looking in the table for the mantissa .00637, the nearest we can find to it is
the mantissa .00647, to which the corresponding number is 1,015. Pointing
this off according to the characteristic, 1, we get 10.15 as the geometric
mean. If greater accuracy is wanted, we may interpolate in the table. Our
mantissa, .00637, falls between the two tabular mantissas .00604 and .00647.
We therefore have .00637 - .00604 = .00033; .00647 - .00604 = .00043;
and .00033/.00043 = .767. That is, our mantissa indicates a number about
three-fourths of the way between 10.14 and 10.15, or roughly
10.14 + .00767 = 10.14767.

* $5 \cdot 11 \cdot 19 = 5 \times 11 \times 19$.
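
The geometric-mean calculation above can be retraced in a few lines. This is a sketch for checking the arithmetic, not the author's procedure: it uses Python's `math.log10` in place of Table 7, and the variable names are assumptions.

```python
import math

values = [5, 11, 19]

# Principles 1 and 3: log G is the mean of the logarithms of the values.
log_sum = sum(math.log10(v) for v in values)
log_g = log_sum / len(values)

# Convert the logarithm back to a number to obtain the geometric mean.
geometric_mean = 10 ** log_g

print(round(log_sum, 5), round(log_g, 5), round(geometric_mean, 3))
```

The result agrees with the interpolated table value of about 10.148 to three decimal places.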
Table 7. — Five-place Common Logarithms of Numbers

100-149

No.      0      1      2      3      4      5      6      7      8      9
100  00000  00043  00087  00130  00173  00217  00260  00303  00346  00389
101  00432  00475  00518  00561  00604  00647  00689  00732  00775  00817
102  00860  00903  00945  00988  01030  01072  01115  01157  01199  01242
103  01284  01326  01368  01410  01452  01494  01536  01578  01620  01662
104  01703  01745  01787  01828  01870  01912  01953  01995  02036  02078
105  02119  02160  02202  02243  02284  02325  02366  02407  02449  02490
106  02531  02572  02612  02653  02694  02735  02776  02816  02857  02898
107  02938  02979  03019  03060  03100  03141  03181  03222  03262  03302
108  03342  03383  03423  03463  03503  03543  03583  03623  03663  03703
109  03743  03782  03822  03862  03902  03941  03981  04021  04060  04100
110  04139  04179  04218  04258  04297  04336  04376  04415  04454  04493
111  04532  04571  04610  04650  04689  04727  04766  04805  04844  04883
112  04922  04961  04999  05038  05077  05115  05154  05192  05231  05269
113  05308  05346  05385  05423  05461  05500  05538  05576  05614  05652
114  05690  05729  05767  05805  05843  05881  05918  05956  05994  06032
115  06070  06108  06145  06183  06221  06258  06296  06333  06371  06408
116  06446  06483  06521  06558  06595  06633  06670  06707  06744  06781
117  06819  06856  06893  06930  06967  07004  07041  07078  07115  07151
118  07188  07225  07262  07298  07335  07372  07408  07445  07482  07518
119  07555  07591  07628  07664  07700  07737  07773  07809  07846  07882
120  07918  07954  07990  08027  08063  08099  08135  08171  08207  08243
121  08279  08314  08350  08386  08422  08458  08493  08529  08565  08600
122  08636  08672  08707  08743  08778  08814  08849  08884  08920  08955
123  08991  09026  09061  09096  09132  09167  09202  09237  09272  09307
124  09342  09377  09412  09447  09482  09517  09552  09587  09621  09656
125  09691  09726  09760  09795  09830  09864  09899  09934  09968  10003
126  10037  10072  10106  10140  10175  10209  10243  10278  10312  10346
127  10380  10415  10449  10483  10517  10551  10585  10619  10653  10687
128  10721  10755  10789  10823  10857  10890  10924  10958  10992  11025
129  11059  11093  11126  11160  11193  11227  11261  11294  11327  11361
130  11394  11428  11461  11494  11528  11561  11594  11628  11661  11694
131  11727  11760  11793  11826  11860  11893  11926  11959  11992  12024
132  12057  12090  12123  12156  12189  12222  12254  12287  12320  12352
133  12385  12418  12450  12483  12516  12548  12581  12613  12646  12678
134  12710  12743  12775  12808  12840  12872  12905  12937  12969  13001
135  13033  13066  13098  13130  13162  13194  13226  13258  13290  13322
136  13354  13386  13418  13450  13481  13513  13545  13577  13609  13640
137  13672  13704  13735  13767  13799  13830  13862  13893  13925  13956
138  13988  14019  14051  14082  14114  14145  14176  14208  14239  14270
139  14301  14333  14364  14395  14426  14457  14489  14520  14551  14582
140  14613  14644  14675  14706  14737  14768  14799  14829  14860  14891
141  14922  14953  14983  15014  15045  15076  15106  15137  15168  15198
142  15229  15259  15290  15320  15351  15381  15412  15442  15473  15503
143  15534  15564  15594  15625  15655  15685  15715  15746  15776  15806
144  15836  15866  15897  15927  15957  15987  16017  16047  16077  16107
145  16137  16167  16197  16227  16256  16286  16316  16346  16376  16406
146  16435  16465  16495  16524  16554  16584  16613  16643  16673  16702
147  16732  16761  16791  16820  16850  16879  16909  16938  16967  16997
148  17026  17056  17085  17114  17143  17173  17202  17231  17260  17289
149  17319  17348  17377  17406  17435  17464  17493  17522  17551  17580
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Table 7 — Five-place Common Logabithms op Nembees. — {Continued) 

200-249 



■ 


1 



2 


3 


4 


5 


6 


7 


8 


9 

200 

30 

103 

30 

125 

30 

146 

30 

168 

30 

190 

30 

211 

30 

233 

30 

255 

SO 

276 

30 

298 

201 

30 

320 

30 

341 

30 

363 

30 

384 

30 

406 

30 

428 

30 

449 

30 

471 

30 

492 

30 

514 

202 

30 

535 

30 

557 

30 

578 

30 

600 

30 

621 

30 

643 

30 

664 

30 

685 

30 

707 

30 

'728 

203 

30 

7^0 

30 

771 

30 

792 

30 

814 

30 

835 

30 

856 

30 

878 

30 

899 

30 

920 

30 

942 

204 

30 

963 

30 

984 

31 

006 

31 

027 

31 

048 

31 

069 

31 

091 

31 

112 

31 

133 

31 

154 

205 

31 

175 

31 

197 ' 

31 

218 

31 

239 

31 

260 

31 

281 

31 

302 

31 

323 

31 

34§ 

31 

366 

206 

31 

387 

31 

408 

31 

429 

31 

450 

31 

471 

31 

492 

31 

513 

31 

534 

31 

555 

31 

576 

207 

31 

597 

31 

618 

31 

639 

31 

660 

31 

681 

31 

702 

31 

723 

31 

744 

31 

765 

31 

785 

208 

31 

806 

31 

827 

31 

848 

31 

869 

31 

890 

31 

911 

31 

931 

31 

952 

31 

973 

31 

994 

209 

32 

015 

32 

035 

32 

056 

32 

077 

32 

098 

32 

118 

32 

139 

32 

160 

32 

181 

32 

201 

210 

32 

222 

32 

243 

32 

263 

32 

284 

32 

305 

32 

325 

32 

346 

32 

366 

32 

387 

32 

408 

211 

32 

428 

32 

449 

32 

469 

32 

490 

32 

510 

32 

531 

32 

552 

32 

572 

32 

593 

32 

613 

212 

32 

634 

32 

654 

32 

675 

32 

695 

32 

715 

32 

736 

32 

756 

32 

777 

32 

797 

32 

818 

213 

32 

838 

32 

858 

32 

879 

32 

899 

32 

919 

32 

940 

32 

960 

32 

980 

33 

001 

33 

021 

214 

33 

041 

33 

062 

33 

082 

33 

102 

33 

122 

33 

143 

33 

163 

33 

183 

33 

203 

33 

224 

i 

215 

33 

244 

33 

264 

33 

284 

33 

304 

33 

325 

33 

345 

33 

365 

33 

385 

33 

405 

33 

425 

216 

33 

445 

33 

465 

33 

486 

33 

506 

33 

526 

33 

546 

33 

566 

33 

586 

33 

606 

33 

626 

217 

33 

646 

33 

666 

33 

686 

33 

706 

33 

726 

33 

746 

33 

766 

33 

786 

33 

806 

33 

826 

218 

33 

846 

33 

866 

33 

885 

33 

905 

33 

925 

33 

945 

33 

965 

33 

985 

34 

005 

34 

025 

^9 

34 

044 

34 

064 

34 

084 

34 

104 

34 

124 

34 

143 

34 

163 

34 

183 

34 

203 

34 

223 

220 

34 

242 

34 

262 

34 

282 

34 

301 

34 

321 

34 

341 

34 

361 

34 

380 

34 

400 

34 

420 

221 

34 

439 

34 

459 

34 

479 

34 

498 

34 

518 

34 

537 

34 

557 

34 

577 

34 

596 

34 

616 

222 

34 

635 

34 

655 

34 

674 

34 

694 

34 

713 

34 

733 

34 

753 

34 

772 

34 

792 

34 

811 

223 

34 

830 

34 

850 

34 

869 

34 

889 

34 

908 

34 

928 

34 

947 

34 

967 

34 

986 

35 

005 

224 

35 

025 

35 

044 

35 

064 

35 

083 

35 

102 

35 

122 

35 

141 

35 

160 

35 

180 

35 

199 

225 

35 

218 

35 

238 

35 

257 

35 

276 

35 

295 

35 

315 

85 

334 

35 

353 

35 

372 

35 

392 

226 

35 

411 

35 

430 

35 

449 

35 

468 

35 

488 

35 

507 

35 

526 

35 

545 

35 

564 

35 

583 

227 

35 

603 

35 

622 

35 

641 

35 

660 

35 

679 

35 

698 

35 

717 

35 

736 

35 

755 

35 

774 

228 

35 

793 

35 

813 

35 

832 

35 

851 

35 

870 

35 

889 

35 

908 

35 

927 

35 

946 

35 

965 

229 

35 

984 

36 

003 

36 

021 

36 

040 

36 

059 

36 

078 

36 

097 

36 

116 

36 

135 

36 

154 

230 

36 

173 

36 

192 

36 

211 

36 

229 

36 

248 

36 

267 

36 

286 

36 

305 

36 

324 

36 

342 

231 

36 

361 

36 

380 

36 

399 

36 

418 

36 

436 

36 

455 

36 

474 

36 

493 

36 

511 

36 

530 

232 

36 

549 

36 

568 

36 

586 

36 

605 

36 

624 

36 

642 

36 

661 

36 

680 

36 

698 

36 

717 

233 

36 

736 

36 

754 

36 

773 

36 

791 

36 

810 

36 

829 

36 

847 

36 

866 

36 

884 

36 

903 

234 

36 

922 

36 

940 

36 

959 

1 

36 

977 

36 

996 

37 

014 

37 

033 

37 

051 

37 

070 

37 

088 

235 

37 

107 

37 

125 

37 

144 

37 

162 

37 

181 

37 

199 

37 

218 

37 

236 

37 

254 

37 

273 

236 

37 

291 

37 

310 

37 

328 

37 

346 

37 

365 

37 

383 

37 

401 

37 

420 

37 

438 

37 

457 

237 

37 

475 

37 

493 

37 

511 

37 

530 

37 

548 

37 

566 

37 

585 

37 

603 

37 

621 

37 

639 

238 

37 

658 

37 

676 

37 

694 

37 

712 

37 

731 

37 

749 

37 

767 

37 

783 

37 

803 

37 

822 

239 

37 

840 

37 

858 

37 

876 

37 

894 

37 

912 

37 

931 

37 

949 

37 

967 

37 

985 

38 

003 

240 

38 

021 

38 

039 

38 

057 

38 

075 

38 

093 

38 

112 

38 

130 

38 

148 

38 

166 

38 

184 

241 

38 

202 

38 

220 

38 

238 

38 

256 

38 

274 

38 

292 

38 

310 

38 

328 

38 

346 

38 

364 

242 

38 

382 

38 

399 

38 

417 

38 

435 

38 

453 

38 

471 

38 

489 

38 

507 

38 

52g 

38 

543 

243 

38 

561 

38 

578 

38 

596 

38 

614 

38 

632 

38 

650 

38 

668 

38 

686 

38 

703 

38 

721 

244 

38 

739 

38 

757 

38 

775 

38 

792 

38 

810 

38 

828 

38 

846 

38 

863 

38 

881 

38 

899 

245 

38 

917 

38 

934 

38 

952 

38 

970 

38 

987 

39 

005 

39 

023 

39 

041 

39 

058 

39 

076 

246 

39 

094 

39 

111 

39 

129 

39 

146 

39 

164 

39 

182 

39 

199 

39 

217 

39 

235 

39 

252 

247 

39 

270 

39 

287 

39 

305 

39 

322 

39 

340 

39 

358 

39 

375 

39 

393 

39 

410 

39 

428 

248 

39 

445 

39 

463 

39 

480 

39 

498 

39 

515 

39 

533 

39 

550 

39 

568 

39 

585 

39 

602 

249 

39 

620 

39 

637 

39 

655 

39 

672 

39 

690 

39 

707 

39 

724 

39 

742 

39 

759 

39 

777 

No.


0 


1 


2 


3 


4 


5 


6 


7 


8 


9

250-299



250 

39 

794 

39 

811 

39 

829 

39 

846 

39 

863 

39 

881 

39 

898 

39 

915 

39 

933 

39 

950 

251 

39 

967 

39 

985 

40 

002 

40 

019 

40 

037 

40 

054 

40 

071 

40 

088 

40 

106 

40 

123 

252 

40 

140 

40 

157 

40 

175 

40 

192 

40 

209 

40 

226 

40 

243 

40 

261 

40 

278 

40 

295 

253 

40 

312 

40 

329 

40 

346 

40 

364 

40 

381 

40 

398 

40 

415 

40 

432 

40 

449 

40 

466 

254 

40 

483 

40 

500 

40 

518 

40 

535 

40 

552

40 

569 

40 

586 

40 

603 

40 

620 

40 

637 

255 

40 

654 

40 

671 

40 

688 

40 

705 

40 

722 

40 

739 

40 

756 

40 

773 

40 

790 

40 

807 

256 

40 

824 

40 

841 

40 

858 

40 

875 

40 

892

40 

909 

40 

926 

40 

943 

40 

960 

40 

976 

257 

40 

993 

41 

010 

41 

027 

41 

044 

41 

061 

41 

078 

41 

095 

41 

111 

41 

128 

41 

145 

258 

41 

162 

41 

179 

41 

196 

41 

212 

41 

229 

41 

246 

41 

263 

41 

280 

41 

296 

41 

313 

259

41 

330 

41 

347 

41 

363 

41 

380 

41 

397 

41 

414 

41 

430 

41 

447 

41 

464 

41 

481 

260 

41 

497 

41 

514 

41 

531 

41 

547 

41 

564

41 

581 

41 

597

41 

614 

41 

631 

41 

647 

261 

41 

664 

41 

681 

41 

697 

41 

714 

41 

731 

41 

747 

41 

764 

41 

780 

41 

797 

41 

814 

262 

41 

830 

41 

847 

41 

863 

41 

880 

41 

896 

41 

913 

41 

929 

41 

946 

41 

963 

41 

979 

263 

41 

996 

42 

012 

42 

029 

42 

045 

42 

062 

42 

078 

42 

095 

42 

111 

42 

127 

42 

144 

264 

42 

160 

42 

177 

42 

193 

42 

210 

42 

226 

42 

243 

42 

259 

42 

275 

42 

292 

42 

308 

265 

42 

325

42 

341 

42 

357 

42 

374 

42 

390 

42 

406 

42 

423 

42 

439 

42 

455 

42 

472 

266 

42 

488 

42 

504 

42 

521 

42 

537 

42 

553 

42 

570 

42 

586 

42 

602 

42 

619 

42 

635 

267 

42 

651 

42 

667 

42 

684 

42 

700 

42 

716 

42 

732 

42 

749 

42 

765 

42 

781 

42 

797 

268 

42 

813 

42 

830 

42 

846 

42 

862 

42 

878 

42 

894 

42 

911 

42 

927 

42 

943 

42 

959 

269 

42 

975 

42 

991 

43 

008 

43 

024 

43 

040 

43 

056 

43 

072 

43 

088 

43 

104 

43

120

270 

43 

136 

43 

152 

43 

169 

43 

185 

43 

201 

43 

217 

43 

233 

43 

249 

43 

265 

43 

281 

271 

43 

297 

43 

313 

43 

329 

43 

345 

43 

361 

43 

377 

43 

393 

43 

409 

43 

425 

43 

441 

272 

43 

457 

43 

473 

43 

489 

43 

505 

43 

521 

43 

537

43 

553 

43 

569

43 

584 

43 

600 

273 

43 

616 

43 

632 

43 

648 

43 

664 

43 

680 

43 

696 

43 

712 

43 

727 

43 

743 

43 

759 

274 

43 

775 

43 

791 

43 

807 

43 

823 

43 

838 

43 

854 

43 

870 

43 

886 

43 

902 

43 

917 

275 

43 

933 

43 

949 

43 

965 

43 

981 

43 

996 

44 

012 

44 

028 

44 

044 

44 

059

44 

075

276 

44 

091 

44 

107 

44 

122 

44 

138 

44 

154 

44 

170 

44 

185 

44 

201 

44 

217 

44 

232 

277 

44 

248 

44 

264 

44 

279 

44 

295 

44 

311 

44 

326 

44 

342 

44 

358 

44 

373 

44 

389 

278 

44 

404 

44 

420 

44 

436 

44 

451 

44 

467 

44 

483 

44 

498 

44 

514

44 

529 

44 

545 

279 

44 

560 

44 

576 

44 

592 

44 

607 

44 

623 

44 

638 

44 

654 

44 

669 

44 

685 

44 

700 

280 

44 

716 

44 

731 

44 

747 

44 

762 

44 

778 

44 

793 

44 

809 

44 

824 

44 

840 

44 

855 

281 

44 

871 

44 

886 

44 

902 

44 

917 

44 

932 

44 

948 

44 

963 

44 

979 

44 

994 

45 

010 

282 

45 

025 

45 

040 

45 

056 

45 

071 

45 

086 

45 

102 

45 

117 

45 

133 

45 

148 

45

163 

283 

45 

179 

45 

194 

45 

209 

45 

225 

45 

240 

45 

255 

45 

271 

45 

286 

45 

301 

45 

317 

284 

45 

332 

45 

347 

45 

362 

45 

378 

45 

393 

45 

408 

45 

423 

45 

439 

45 

454 

45 

469 

285 

45 

484 

45 

500 

45 

515

45 

530 

45 

545 

45 

561 

45 

576

45 

591 

45 

606 

45 

621 

286 

45 

637 

45 

652 

45 

667 

45 

682 

45 

697 

45 

712 

45 

728 

45 

743 

45 

758 

45 

773 

287 

45 

788 

45 

803 

45 

818 

45 

834 

45 

849 

45 

864 

45 

879 

45 

894 

45 

909 

45 

924 

288 

45 

939 

45 

954 

45 

969 

45 

984 

46 

000 

46 

015 

46 

030 

46 

045 

46 

060 

46 

075 

289 

46 

090 

46 

105 

46 

120 

46 

135 

46 

150 

46 

165 

46 

180 

46 

195 

46 

210 

46 

225 

290

46 

240 

46 

255 

46 

270 

46 

285 

46 

300 

46 

315 

46 

330 

46 

345 

46 

359

46 

374 

291 

46 

389 

46 

404 

46 

419 

46 

434 

46 

449 

46 

464 

46 

479 

46 

494 

46 

509 

46 

523 

292 

46 

538 

46 

553 

46 

568 

46 

583 

46 

598

46 

613 

46 

627 

46 

642 

46 

657 

46 

672 

293 

46 

687 

46 

702 

46 

716 

46 

731 

46 

746 

46 

761 

46 

776 

46 

790 

46 

805 

46 

820 

294 

46 

835

46 

850 

46 

864 

46 

879 

46 

894 

46 

909 

46 

923 

46 

938 

46 

953 

46 

967 

295 

46 

982 

46 

997 

47 

012 

47 

026 

47 

041 

47 

056 

47 

070 

47 

085 

47 

100 

47 

114 

296 

47 

129 

47 

144 

47 

159 

47 

173 

47 

188 

47 

202 

47 

217 

47 

232 

47 

246 

47 

261 

297 

47 

276 

47 

290 

47 

305 

47 

319 

47 

334 

47 

349 

47 

363 

47 

378 

47 

392 

47 

407 

298 

47 

422 

47 

436 

47 

451 

47 

465

47 

480 

47 

494 

47 

509

47 

524 

47 

538 

47 

553 

299 

47

567 

47 

582 

47 

596 

47 

611 

47 

625 

47 

640 

47 

654 

47 

669 

47 

683 

47 

698

Table 7. Five-place Common Logarithms of Numbers (Continued)

300-349 


No. 


0 


1 


2 


3 


4


5 


6 


7 


8 


9 

300

47 

712 

47 

727 

47 

741 

47 

756 

47 

770 

47 

784 

47 

799 

47 

813 

47 

828 

47 

842 

301

47 

857 

47 

871 

47 

885 

47 

900 

47 

914 

47 

929 

47 

943 

47 

958 

47 

972 

47 

986 

302

48 

001 

48 

015

48 

029 

48 

044 

48 

058 

48 

073 

48 

087 

48 

101 

48 

116 

48 

130 

303 

48 

144 

48 

159 

48 

173 

48 

187 

48 

202 

48 

216 

48 

230 

48 

244 

48 

259 

48 

273 

304 

48 

287 

48 

302 

48 

316 

48 

330 

48 

344 

48 

359 

48 

373 

48 

387 

48 

401 

48 

416 

305 

48 

430 

48 

444 

48 

458 

48 

473 

48 

487 

48 

501 

48 

515 

48 

530 

48 

544

48 

558 

306 

48 

572 

48 

586

48 

601 

48 

615 

48 

629 

48 

643 

48 

657 

48 

671 

48 

686 

48 

700 

307 

48 

714 

48 

728 

48 

742 

48 

756 

48 

770 

48 

785 

48 

799 

48 

813 

48 

827 

48 

841 

308 

48 

855 

48 

869 

48 

883 

48 

897 

48 

911 

48 

926 

48 

940 

48 

954 

48 

968 

48 

982 

309 

48 

996 

49 

010 

49 

024 

49 

038 

49 

052

49 

066 

49 

080 

49 

094 

49 

108 

49 

122 

310

49 

136 

49 

150 

49 

164 

49 

178 

49 

192 

49 

206 

49 

220 

49 

234 

49 

248 

49 

262 

311 

49 

276 

49 

290 

49 

304 

49 

318 

49 

332 

49 

346

49 

360 

49 

374 

49 

388 

49 

402 

312 

49 

415 

49 

429 

49 

443 

49 

457 

49 

471 

49 

485 

49 

499 

49 

513

49

527

49

541

313 

49 

554 

49 

568

49

582

49

596

49 

610 

49 

624 

49 

638 

49 

651

49 

665 

49 

679 

314 

49 

693 

49 

707 

49 

721 

49 

734 

49 

748 

49 

762 

49 

776 

49 

790 

49 

803 

49 

817 

315 

49 

831 

49 

845 

49 

859 

49 

872 

49 

886 

49 

900 

49 

914 

49 

927 

49 

941 

49 

955

316 

49 

969 

49 

982 

49 

996 

50

010

50

024

50

037

50

051

50

065

50

079

50

092 

317 

50 

106 

50

120 

50 

133 

50 

147 

50

161 

50 

174 

50

188

50

202

50

215 

50 

229 

318 

50 

243 

50 

256 

50 

270 

50

284 

50 

297 

50

311

50

325

50

338 

50 

352 

50

365 

319 

50 

379 

50

393 

50 

406 

50 

420 

50

433 

50 

447 

50

461 

50 

474 

50

488

50

501

320

50 

515 

50 

529 

50 

542 

50

556 

50 

569 

50 

583 

50 

596 

50 

610 

50 

623 

50 

637 

321 

50 

651 

50 

664 

50

678 

50 

691 

50 

705 

50 

718 

50 

732 

50 

745 

50 

759 

50 

772 

322 

50 

786 

50 

799 

50 

813 

50 

826

50 

840 

50

853 

50 

866 

50 

880 

50

893 

50 

907 

323 

50 

920 

50 

934 

50 

947 

50 

961 

50

974 

50 

987 

51 

001 

51 

014 

51 

028 

51 

041 

324 

51 

055

51 

068 

51 

081 

51 

095

51

108 

51 

121 

51

135

51 

148 

51 

162 

51

175 

325 

51 

188 

51 

202 

51 

215 

51 

228 

51 

242 

51 

255 

51

268 

51 

282 

51 

295 

51 

308 

326 

51 

322 

51 

335 

51 

348 

51 

362 

51

375 

51 

388

51 

402 

51 

415

51 

428 

51 

441 

327 

51 

455 

51 

468 

51

481 

51 

495 

51 

508

51 

521 

51 

534 

51 

548

51

561 

51 

574 

328

51 

587 

51 

601 

51 

614 

51 

627 

51 

640 

51 

654 

51 

667 

51 

680 

51 

693 

51

706 

329 

51 

720 

51 

733 

51 

746 

51 

759 

51 

772 

51

786 

51 

799 

51 

812 

51 

825 

51 

838 

330

51 

851 

51 

865

51 

878 

51 

891 

51 

904 

51 

917 

51 

930 

51 

943 

51 

957 

51 

970 

331 

51 

983 

51 

996 

52 

009 

52

022 

52 

035 

52 

048 

52 

061 

52 

075 

52 

088 

52 

101 

332 

52 

114 

52 

127 

52 

140 

52 

153 

52 

166 

52 

179 

52 

192 

52 

205 

52

218 

52 

231 

333 

52 

244 

52 

257 

52 

270 

52 

284 

52 

297 

52 

310 

52 

323 

52

336 

52 

349 

52 

362 

334 

52 

375 

52 

388 

52 

401 

52 

414 

52 

427 

52 

440 

52 

453 

52 

466 

52 

479 

52 

492 

335 

52 

504 

52 

517 

52 

530 

52 

543 

52 

556 

52 

569 

52

582 

52 

595 

52 

608 

52 

621 

336 

52 

634 

52 

647 

52 

660 

52 

673 

52

686 

52 

699 

52 

711 

52 

724 

52 

737 

52 

750 

337 

52 

763 

52 

776 

52

789 

52 

802 

52 

815 

52 

827 

52 

840 

52

853 

52 

866 

52 

879 

338 

52 

892 

52 

905 

52 

917 

52 

930 

52

943 

52 

956 

52 

969 

52 

982 

52 

994 

53 

007 

339 

53 

020 

53 

033 

53 

046 

53 

058 

53 

071 

53 

084 

53 

097 

53 

110 

53 

122 

53 

135 

340

53 

148 

53 

161 

53 

173 

53 

186 

53 

199 

53 

212 

53 

224 

53 

237 

53

250 

53 

263 

341 

53 

275 

53 

288 

53 

301 

53 

314 

53 

326 

53 

339 

53 

352 

53 

364 

53 

377 

53 

390 

342 

53 

403 

53 

415 

53 

428 

53 

441 

53 

453 

53 

466 

53 

479 

53 

491 

53 

504 

53 

517 

343 

53 

529 

53 

542 

53 

555 

53 

567 

53 

580 

53 

593 

53 

605 

53 

618 

53 

631 

53 

643 

344 

53 

656 

53 

668 

53 

681 

53 

694 

53 

706 

53 

719 

53 

732 

53 

744 

53 

757 

53 

769 

345 

53 

782 

53 

794 

53 

807 

53 

820 

53 

832 

53 

845 

53 

857 

53 

870 

53 

882 

53 

895 

346 

53 

908 

53 

920 

53 

933 

53 

945 

53 

958 

53 

970 

53 

983 

53 

995 

54 

008 

54 

020 

347 

54 

033 

54 

045 

54 

058 

54 

070 

54 

083 

54 

095 

54

108 

54 

120 

54 

133 

54 

145 

348 

54

158 

54 

170 

54 

183 

54 

195 

54 

208 

54 

220 

54

233 

54 

245 

54 

258 

54 

270 

349 

54 

283 

54 

295 

54 

307 

54 

320 

54 

332 

54 

345 

54

357 

54 

370 

54 

382 

54 

394 

No.


0 


1 

2 

3 


4 


5 


6 


7 


8 


9 


300-349 
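
Individual entries of Table 7 can be checked mechanically. The following is a minimal sketch, assuming (as the printed values suggest) that each entry is the mantissa of the common logarithm of a four-digit number, rounded to five decimal places. The function names `mantissa5` and `table_row` are hypothetical and do not come from the text.

```python
import math


def mantissa5(n):
    """Return the five-place mantissa of log10(n), spaced like a table entry."""
    frac = math.log10(n) % 1.0       # discard the characteristic
    digits = round(frac * 100_000)   # keep five decimal places
    if digits == 100_000:            # guard: a mantissa that rounds up to 1.00000
        digits = 0
    s = f"{digits:05d}"
    return f"{s[:2]} {s[2:]}"


def table_row(no):
    """Return the ten entries of the row labeled `no` (300 covers 3000 to 3009)."""
    return [mantissa5(no * 10 + k) for k in range(10)]


if __name__ == "__main__":
    print(300, *table_row(300))
```

Under these assumptions, `table_row(300)` begins `47 712`, `47 727`, `47 741`, which agrees with the first printed entries of row 300.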















Table 7. Five-place Common Logarithms of Numbers (Continued)

350-399


No 


0 


1 


2 


3 


4 


5 


6 


7 


8 


9 

350 

54 

407 

54 

419 

54 

432 

54

444 

54 

456 

54 

469 

54

481 

54 

494 

54 

506 

54

518 

351 

54 

531 

54 

543 

54 

555 

54 

568 

54 

580 

54 

593 

54 

605 

54 

617 

54 

630 

54 

642 

352 

54 

654 

54

667 

54 

679 

54 

691 

54

704 

54 

716 

54 

728 

54 

741 

54 

753 

54 

765 

353 

54 

777 

54 

790 

54

802 

54 

814 

54 

827 

54 

839 

54 

851 

54

864 

54 

876 

54 

888 

354 

54 

900 

54

913

54

925 

54 

937 

54 

949 

54 

962 

54

974

54

986

54

998

55

011

355 

55 

023 

55 

035 

55

047 

55 

060 

55 

072 

55

084

55

096

55

108 

55 

121 

55

133 

356 

55 

145

55

157 

55 

169 

55 

182 

55

194 

55 

206 

55 

218 

55 

230 

55

242

55

255 

357 

55 

267 

55

279

55

291 

55 

303 

55 

315

55 

328 

55 

340 

55 

352 

55 

364 

55 

376 

358 

55 

388 

55

400

55

413 

55 

425 

55

437 

55 

449 

55

461

55

473

55

485 

55 

497 

359 

55 

509 

55

522

55 

534 

55 

546 

55 

558 

55 

570 

55 

582 

55 

594 

55 

606 

55 

618 

360 

55 

630 

55 

642 

55

654

55

666 

55 

678 

55 

691 

55 

703 

55

715

55

727

55

739 

361 

55 

751 

55 

763 

55 

775

55 

787 

55 

799 

55 

811 

55 

823 

55

835

55 

847 

55

859 

362 

55 

871 

55 

883 

55 

895 

55 

907 

55 

919 

55 

931 

55

943

55

955

55

967 

55 

979 

363 

55 

991 

56 

003 

56 

015 

56 

027 

56 

038 

56 

050 

56 

062 

56 

074 

56

086 

56 

098 

364 

56 

110 

56 

122 

56

134 

56 

146 

56

158 

56 

170 

56

182 

56 

194 

56 

205 

56 

217 

365 

56 

229 

56 

241 

56 

253 

56 

265 

56

277 

56 

289 

56

301 

56 

312 

56

324

56

336 

366 

56 

348 

56 

360 

56

372 

56 

384 

56 

396 

56 

407 

56 

419 

56

431 

56 

443 

56 

455 

367 

56 

467 

56 

478 

56 

490 

56 

502 

56 

514 

56 

526

56 

538 

56 

549 

56 

561 

56 

573 

368 

56 

585 

56 

597

56 

608 

56 

620 

56 

632 

56 

644 

56 

656 

56 

667 

56 

679 

56 

691 

369 

56 

703 

56 

714 

56 

726 

56 

738 

56 

750 

56 

761 

56

773

56

785 

56 

797 

56 

808 

370

56 

820 

56 

832 

56 

844 

56 

855 

56 

867 

56 

879 

56 

891 

56 

902 

56 

914 

56 

926 

371 

56 

937 

56 

949 

56 

961 

56 

972 

56 

984 

56 

996 

57 

008 

57 

019 

57 

031 

57

043 

372 

57 

054 

57

066 

57 

078 

57 

089 

57 

101 

57 

113 

57 

124 

57 

136 

57 

148 

57

159 

373 

57 

171 

57 

183 

57 

194 

57 

206 

57 

217 

57 

229 

57

241

57

252

57

264

57

276 

374 

57 

287 

57 

299 

57

310 

57 

322 

57

334 

57 

345 

57 

357 

57

368

57

380

57

392 

375 

57 

403 

57

415

57

426 

57 

438 

57

449

57

461 

57 

473 

57

484 

57 

496 

57

507 

376 

57 

519 

57 

530 

57 

542 

57 

553 

57

565

57

576

57 

588 

57

600

57

611 

57 

623 

377 

57 

634 

57

646 

57 

657 

57 

669 

57

680 

57 

692 

57

703 

57 

715 

57

726 

57 

738 

378 

57 

749 

57

761 

57 

772 

57 

784 

57

795 

57 

807 

57

818

57

830

57

841

57

852 

379 

57 

864

57

875 

57 

887 

57

898

57

910

57

921

57

933

57

944

57

955 

57 

967 

380 

57 

978 

57

990

58

001

58

013

58

024

58

035 

58 

047 

58 

058 

58 

070 

58

081 

381 

58 

092 

58 

104 

58

115

58

127

58

138 

58 

149 

58 

161 

58

172

58

184

58

195 

382 

58 

206 

58

218

58

229 

58 

240 

58 

252 

58 

263 

58 

274 

58

286

58

297 

58 

309 

383 

58 

320 

58 

331 

58

343 

58 

354 

58 

365 

58

377 

58 

388 

58 

399 

58

410

58

422 

384 

58 

433 

58 

444 

58

456

58

467

58

478 

58 

490 

58 

501 

58

512

58

524

58

535 

385 

58 

546 

58

557

58

569 

58 

580 

58

591 

58 

602 

58

614

58

625

58

636 

58 

647 

386 

58 

659 

58

670 

58 

681 

58 

692 

58

704

58

715 

58 

726 

58 

737 

58

749 

58 

760 

387 

58 

771 

58

782 

58 

794 

58

805 

58 

816 

58 

827 

58

838

58

850

58

861

58

872 

388 

58 

883 

58

894 

58 

906 

58

917

58

928

58

939

58

950

58

961

58

973 

58 

984 

389 

58 

995 

59

006

59

017 

59 

028 

59 

040

59 

051 

59 

062 

59

073

59

084

59

095 

390 

59 

106 

59 

118 

59

129

59

140

59

151

59

162

59

173

59

184

59

195

59

207 

391 

59 

218 

59

229

59

240

59

251

59

262 

59 

273 

59

284

59

295

59

306 

59 

318 

392 

59 

329 

59

340

59

351

59

362

59

373 

59 

384 

59

395

59

406

59

417 

59 

428 

393 

59 

439 

59 

450 

59

461

59

472 

59 

483 

59 

494 

59

506

59

517 

59 

528

59 

539 

394 

59 

550 

59

561

59

572

59

583 

59 

594

59

605 

59 

616 

59

627 

59 

638 

59 

649

395

59

660

59

671

59

682

59

693

59

704 

59 

715 

59

726

59

737 

59 

748 

59 

759 

396 

59 

770 

59 

780 

59

791

59

802 

59 

813 

59 

824 

59

835 

59 

846 

59 

857 

59 

868 

397 

59 

879 

59 

890 

59 

901 

59

912 

59 

923 

59

934

59

945 

59 

956 

59 

966 

59

977 

398 

59 

988 

59 

999 

60 

010 

60 

021 

60 

032 

60 

043 

60 

054 

60 

065 

60 

076 

60 

086 

399 

60 

097 

60 

108 

60 

119 

60 

130 

60 

141 

60 

152 

60 

163 

60 

173 

60 

184 

60 

195 

No 


0 


1 


2 


3 


4 


5 


6 


7 


8 


9 


350-399

Table 7. Five-place Common Logarithms of Numbers (Continued)

400-449 


No 


0 

1 

2 

3 

4 

5 


6 

7 

8 

9 

400 

60 

206 

60 217 

60 228 

60 239 

60 249 

60 260 

60 

271 

60 282 

60 293 

60 304 

401 

60 

314 

60 325 

60 336 

60 347 

60 358 

60 369 

60 

379 

60 390 

60 401 

60 412

402 

60 

423 

60 433 

60 444 

60 455 

60 466 

60 477 

60 

487 

60 498 

60 509 

60 520 

403 

60 

531 

60 541 

60 552 

60 563

60 574 

60 584 

60 

595 

60 606 

60 617 

60 627 

404 

60 638 

60 649 

60 660 

60 670 

60 681 

60 692 

60 

703 

60 713 

60 724 

60 735

405 

60 

746 

60 756 

60 767 

60 778 

60 788 

60 799 

60 

810 

60 821 

60 831 

60 842 

406 

60 

853 

60 863 

60 874 

60 885 

60 895 

60 906 

60 

917 

60 927 

60 938 

60 949

407 

60 

959 

60 970 

60 981 

60 991 

61 002 

61 013 

61 

023 

61 034 

61 045 

61 055 

408 

61 

066 

61 077 

61 087 

61 098 

61 109 

61 119 

61 

130 

61 140 

61 151 

61 162 

409 

61 

172 

61 183 

61 194 

61 204 

61 215 

61 225

61 236 

61 247 

61 257 

61 268

410 

61 

278 

61 289 

61 300 

61 310 

61 321 

61 331 

61 

342 

61 352 

61 363 

61 374 

411 

61 

384 

61 395 

61 405 

61 416 

61 426 

61 437 

61 

448 

61 458 

61 469 

61 479 

412 

61 

490 

61 500 

61 511 

61 521 

61 532 

61 542 

61 

553 

61 563

61 574 

61 584 

413 

61 

595 

61 606 

61 616 

61 627 

61 637 

61 648 

61 

658 

61 669 

61 679 

61 690 

414 

61 

700 

61 711 

61 721 

61 731 

61 742 

61 752 

61 

763 

61 773 

61 784 

61 794 

415 

61 805

61 815 

61 826 

61 836 

61 847 

61 857 

61 

868 

61 878 

61 888 

61 899 

416 

61 

909 

61 920 

61 930 

61 941 

61 951 

61 962 

61 

972 

61 982 

61 993 

62 003 

417 

62 

014 

62 024 

62 034 

62 045 

62 055 

62 066 

62 

076 

62 086 

62 097 

62 107 

418 

62 

118 

62 128 

62 138 

62 149 

62 159 

62 170 

62 

180 

62 190 

62 201 

62 211 

419 

62 

221 

62 232 

62 242 

62 252 

62 263 

62 273 

62 

284 

62 294 

62 304 

62 315 

420 

62 

325

62 335 

62 346 

62 356 

62 366 

62 377 

62 

387 

62 397 

62 408 

62 418 

421 

62 

428 

62 439 

62 449 

62 459 

62 469 

62 480 

62 

490 

62 500 

62 511 

62 521 

422 

62 

531 

62 542 

62 552 

62 562 

62 572 

62 583 

62 

593 

62 603 

62 613 

62 624 

423 

62 

634 

62 644 

62 655 

62 665 

62 675 

62 685 

62 

696 

62 706 

62 716 

62 726 

424 

62 

737 

62 747 

62 757 

62 767 

62 778 

62 788 

62 798 

62 808 

62 818 

62 829 

425 

62 839 

62 849 

62 859 

62 870 

62 880 

62 890 

62 

900 

62 910 

62 921 

62 931 

426 

62 941 

62 951 

62 961 

62 972 

62 982 

62 992 

63 

002 

63 012 

63 022 

63 033 

427 

63 043 

63 053 

63 063 

63 073 

63 083 

63 094 

63 

104 

63 114 

63 124 

63 134 

428 

63 

144 

63 155 

63 165 

63 175 

63 185 

63 195 

63 

205 

63 215 

63 225 

63 236 

429 

63 

246 

63 256 

63 266 

63 276 

63 286 

63 296 

63 

306 

63 317 

63 327 

63 337 

430 

63 

347 

63 357 

63 367 

63 377 

63 387 

63 397 

63 

407 

63 417 

63 428 

63 438 

431 

63 

448 

63 458 

63 468 

63 478 

63 488 

63 498 

63 

508 

63 518 

63 528 

63 538 

432 

63 

548 

63 558 

63 568 

63 579 

63 589 

63 599 

63 

609

63 619 

63 629 

63 639 

433 

63 

649 

63 659 

63 669 

63 679 

63 689 

63 699 

63 

709 

63 719 

63 729 

63 739 

434 

63 

749 

63 759 

63 769 

63 779 

63 789 

63 799 

63 

809 

63 819 

63 829 

63 839 

435 

63 849 

63 859 

63 869 

63 879 

63 889 

63 899 

63 

909 

63 919 

63 929 

63 939 

436 

63 

949 

63 959 

63 969 

63 979 

63 988 

63 998 

64 

008 

64 018 

64 028 

64 038 

437 

64 

048 

64 058 

64 068 

64 078 

64 088 

64 098 

64 

108 

64 118 

64 128 

64 137 

438 

64 

147 

64 157 

64 167 

64 177 

64 187 

64 197 

64 207 

64 217 

64 227 

64 237 

439 

64 

246 

64 256 

64 266 

64 276 

64 286 

64 296 

64 306 

64 316 

64 326 

64 335 

440 

64 

345 

64 355 

64 365 

64 375 

64 385 

64 395 

64 404 

64 414 

64 424 

64 434 

441 

64 

444 

64 454 

64 464 

64 473 

64 483 

64 493 

64 503 

64 513 

64 523 

64 532 

442 

64 

542 

64 552 

64 562 

64 572 

64 582 

64 591 

64 

601 

64 611 

64 621 

64 631 

443 

64 

640 

64 650 

64 660 

64 670 

64 680 

64 689 

64 699 

64 709 

64 719 

64 729 

444 

64 

738 

64 748 

64 758 

64 768 

64 777 

64 787 

64 797 

64 807 

64 816 

64 826 

445 

64 

836 

64 846 

64 856 

64 865

64 875 

64 885 

64 895 

64 904 

64 914 

64 924 

446 

64 

933 

64 943 

64 953 

64 963 

64 972 

64 982 

64 992 

65 002 

65 011

65 021 

447 

65 

031 

65 040 

65 050 

65 060 

65 070 

65 079 

65 

089 

65 099 

65 108 

65 118 

448 

65 

128 

65 137 

65 147 

65 157 

65 167 

65 176 

65 

186 

65 196 

65 205 

65 215 

449 

65 

225 

65 234 

65 244 

65 254 

65 263 

65 273 

65 283 

65 292 

65 302 

65 312

No 


0 

1 

2 

3 

4 

5 


6 

7 

8 

9 


400-449

Table 7. Five-place Common Logarithms of Numbers (Continued)

450-499 


No. 

0 

1 

2 

3 

4 

5

6 

7 

8 

9 

450 

65 321 

65 331 

65 341 

65 350 

65 360 

65 369 

65 379 

65 389 

65 398 

65 408 

451 

65 418 

65 427 

65 437 

65 447 

65 456 

65 466 

65 475 

65 485 

65 495

65 504 

452 

65 514 

65 523 

65 533 

65 543 

65 552 

65 562 

65 571 

65 581 

65 591 

65 600 

453 

65 610 

65 619 

65 629 

65 639 

65 648 

65 658 

65 667

65 677 

65 686 

65 696

454 

65 706 

65 715

65 725 

65 734

65 744 

65 753 

65 763 

65 772 

65 782 

65 792

455 

65 801 

65 811

65 820 

65 830 

65 839 

65 849 

65 858 

65 868

65 877 

65 887 

456 

65 896 

65 906 

65 916 

65 925 

65 935

65 944 

65 954 

65 963 

65 973 

65 982 

457 

65 992 

66 001 

66 011 

66 020 

66 030 

66 039 

66 049 

66 058 

66 068 

66 077 

458 

66 087 

66 096 

66 106 

66 115

66 124 

66 134 

66 143 

66 153 

66 162 

66 172 

459 

66 181 

66 191 

66 200 

66 210 

66 219 

66 229 

66 238 

66 247 

66 257 

66 266 

460 

66 276 

66 285

66 295

66 304 

66 314 

66 323 

66 332 

66 342 

66 351 

66 361 

461 

66 370 

66 380 

66 389 

66 398 

66 408 

66 417 

66 427 

66 436 

66 445 

66 455

462 

66 464 

66 474 

66 483 

66 492 

66 502 

66 511 

66 521 

66 530 

66 539 

66 549 

463 

66 558 

66 567 

66 577 

66 586 

66 596 

66 605 

66 614 

66 624 

66 633 

66 642 

464 

66 652 

66 661 

66 671 

66 680 

66 689 

66 699 

66 708 

66 717 

66 727 

66 736 

465 

66 745 

66 755 

66 764 

66 773 

66 783 

66 792 

66 801 

66 811 

66 820 

66 829 

466 

66 839 

66 848 

66 857 

66 867 

66 876 

66 885 

66 894 

66 904 

66 913 

66 922 

467 

66 932 

66 941 

66 950 

66 960 

66 969 

66 978 

66 987 

66 997 

67 006 

67 015

468 

67 025 

67 034 

67 043 

67 052 

67 062 

67 071 

67 080 

67 089 

67 099 

67 108 

469 

67 117 

67 127 

67 136 

67 145

67 154 

67 164 

67 173 

67 182 

67 191 

67 201 

470 

67 210 

67 219 

67 228 

67 237 

67 247 

67 256 

67 265

67 274 

67 284 

67 293 

471 

67 302 

67 311 

67 321 

67 330 

67 339 

67 348 

67 357 

67 367 

67 376 

67 385 

472 

67 394 

67 403 

67 413 

67 422 

67 431 

67 440 

67 449 

67 459 

67 468 

67 477 

473 

67 486 

67 495

67 504 

67 514 

67 523 

67 532 

67 541 

67 550 

67 560 

67 569 

474 

67 578 

67 587 

67 596 

67 605 

67 614 

67 624 

67 633 

67 642 

67 651 

67 660 

475 

67 669 

67 679 

67 688 

67 697 

67 706 

67 715 

67 724 

67 733 

67 742 

67 752 

476 

67 761 

67 770 

67 779 

67 788 

67 797 

67 806 

67 815 

67 825

67 834 

67 843 

477 

67 852 

67 861 

67 870 

67 879 

67 888 

67 897 

67 906 

67 916 

67 925 

67 934 

478 

67 943 

67 952 

67 961 

67 970 

67 979 

67 988 

67 997 

68 006 

68 015 

68 024 

479 

68 034 

68 043 

68 052 

68 061 

68 070 

68 079 

68 088 

68 097 

68 106 

68 115 

480 

68 124 

68 133 

68 142 

68 151 

68 160 

68 169 

68 178 

68 187 

68 196 

68 205 

481

68 215

68 224 

68 233 

68 242 

68 251 

68 260 

68 269 

68 278 

68 287 

68 296 

482 

68 305

68 314 

68 323 

68 332 

68 341 

68 350 

68 359 

68 368 

68 377 

68 386 

483 

68 395 

68 404 

68 413 

68 422 

68 431 

68 440 

68 449 

68 458

68 467 

68 476 

484 

68 485 

68 494 

68 502 

68 511 

68 520 

68 529 

68 538 

68 547

68 556 

68 565 

485 

68 574 

68 583 

68 592 

68 601 

68 610 

68 619 

68 628 

68 637 

68 646 

68 655

486 

68 664 

68 673 

68 681 

68 690 

68 699 

68 708 

68 717 

68 726 

68 735 

68 744 

487 

68 753 

68 762 

68 771 

68 780 

68 789 

68 797 

68 806 

68 815

68 824 

68 833 

488 

68 842 

68 851 

68 860 

68 869 

68 878 

68 886 

68 895 

68 904 

68 913 

68 922 

489 

68 931 

68 940 

68 949 

68 958 

68 966 

68 975

68 984 

68 993 

69 002 

69 011

490 

69 020 

69 028 

69 037 

69 046 

69 055 

69 064 

69 073 

69 082 

69 090 

69 099 

491 

69 108 

69 117 

69 126 

69 135

69 144 

69 152 

69 161 

69 170 

69 179 

69 188 

492 

69 197 

69 205 

69 214 

69 223 

69 232 

69 241 

69 249 

69 258 

69 267 

69 276 

493 

69 285 

69 294 

69 302 

69 311 

69 320 

69 329 

69 338 

69 346 

69 355 

69 364 

494 

69 373 

69 381 

69 390 

69 399 

69 408 

69 417 

69 425

69 434 

69 443 

69 452 

495

69 461 

69 469 

69 478 

69 487 

69 496 

69 504 

69 513 

69 522 

69 531 

69 539 

496 

69 548

69 557 

69 566 

69 574 

69 583 

69 592 

69 601 

69 609 

69 618 

69 627 

497 

69 636 

69 644 

69 653 

69 662 

69 671 

69 679 

69 688 

69 697 

69 705 

69 714 

498 

69 723 

69 732 

69 740 

69 749 

69 758 

69 767 

69 775 

69 784 

69 793 

69 801

499

69 810 

69 819 

69 827 

69 836 

69 84S 

69 854 

69 862 

69 871 

69 880 

69 888 

No. 

0 

1 

2 

3 

4 

5

6 

7 

8 

9 


450-499

Table 7. Five-place Common Logarithms of Numbers (Continued)

500-549



0 


1 


2 


3 


4 

69 

897 

69 

906 

69 

914 

69 

923 

69 

932 

69 

984 

69 

992 

70 

001 

70 

010 

70 

018 

70 

070 

70 

079 

70 

088 

70 

096 

70 

105 

70 

157 

70 

165 

70 

174 

70 

183 

70 

191 

70 

243 

70 

252

70 

260 

70 

269 

70 

278 

70 

329 

70 

338 

70 

346 

70 

355 

70 

364 

70 

415 

70 

424 

70 

432 

70 

441 

70 

449 

70 

501 

70 

509 

70 

518 

70 

526

70

535

70 

586 

70 

595

70 

603 

70 

612 

70 

621 

70 

672 

70 

680 

70 

689 

70 

697 

70 

706 

70 

757

70 

766 

70 

774 

70 

783 

70 

791 

70 

842 

70 

851 

70 

859 

70 

868 

70 

876 

70 

927 

70 

935 

70 

944 

70 

952 

70 

961 

71 

012 

71 

020 

71 

029 

71 

037 

71 

046 

71 

096 

71 

105 

71 

113 

71 

122 

71 

130 

71 

181 

71 

189 

71 

198 

71 

206 

71 

214 

71 

265

71 

273 

71 

282 

71 

290 

71 

299 

71 

349 

71 

357 

71 

366 

71 

374 

71 

383 

71 

433 

71 

441 

71 

450 

71 

458

71 

466 

71 

517

71 

525 

71 

533 

71 

542

71 

550 

71 

600 

71 

609 

71 

617 

71 

625 

71 

634 

71 

684 

71 

692 

71 

700 

71 

709 

71 

717 

71 

767 

71 

775 

71 

784 

71 

792 

71 

800 

71 

850 

71 

858 

71 

867 

71 

875

71 

883 

71 

933 

71 

941 

71 

950 

71 

958 

71 

966 

72 

016 

72 

024 

72 

032 

72 

041 

72 

049 

72 

099 

72 

107 

72 

115 

72 

123 

72 

132 

72 

181 

72 

189 

72 

198 

72 

206 

72 

214 

72 

263 

72 

272 

72 

280 

72 

288 

72 

296 

72 

346 

72 

354 

72 

362 

72 

370 

72 

378 

72 

428 

72 

436 

72 

444 

72 

452

72 

460 

72 

509

72 

518 

72 

526

72

534

72 

542 

72 

591

72

599

72 

607 

72 

616 

72 

624 

72 

673 

72 

681 

72 

689 

72 

697 

72 

705 

72 

754 

72 

762 

72 

770 

72 

779 

72 

787 

72 

835 

72 

843 

72 

852

72 

860 

72 

868 

72 

916 

72 

925 

72 

933 

72 

941 

72 

949 

72 

997 

73 

006 

73 

014 

73 

022 

73 

030 

73 

078 

73 

086 

73 

094 

73 

102 

73 

111 

73 

159 

73 

167 

73 

175 

73 

183 

73 

191 

73 

239 

73 

247 

73 

255 

73 

263 

73 

272 

73 

320 

73 

328 

73 

336 

73 

344 

73 

352 

73 

400 

73 

408 

73 

416 

73 

424 

73 

432 

73 

480 

73 

488 

73 

496 

73 

504 

73 

512 

73 

560 

73 

568 

73 

576 

73 

584 

73 

592 

73 

640 

73 

648 

73 

656 

73 

664 

73 

672 

73 

719 

73 

727 

73 

735 

73 

743 

73 

751 

73 

799 

73 

807 

73 

815 

73 

823 

73 

830 

73 

878 

73 

886 

73 

894 

73 

902 

73 

910 

73 

957 

73 

965 

73 

973 

73 

981 

73 

989 


i 

5 

6 


7 

8 


9 

69 

940 

69 

949 

69 

958 

69 

966 

69 

975 

70 

027 

70 

036 

70 

044 

70 

053 

70 

062 

70 

114 

70 

122 

70 

131 

70 

140 

70 

148 

70 

200 

70 

209 

70 

217 

70 

226 

70 

234 

70 

286 

70 

295 

70 

303 

70 

312 

70 

321 

70 

372 

70 

381 

70 

389 

70 

398 

70 

406 

70 

458 

70 

467 

70 

475

70 

484 

70 

492 

70 

544 

70 

552

70

561

70 

569 

70 

578 

70 

629 

70 

638 

70 

646 

70 

655 

70 

663 

70 

714 

70 

723 

70 

731 

70 

740 

70 

749 

70 

800 

70 

808 

70 

817 

70 

825 

70 

834 

70 

885 

70 

893 

70 

902 

70 

910 

70 

919 

70 

969 

70 

978 

70 

986 

70 

995 

71 

003 

71 

054 

71 

063 

71 

071 

71 

079 

71 

088 

71 

139 

71 

147 

71 

156 

71 

164 

71 

172 

71 

223 

71 

231 

71 

240 

71 

248 

71 

257 

71 

307 

71 

315 

71 

324 

71 

332 

71 

341 

71 

391 

71 

399 

71 

408 

71 

416 

71 

425 

71 

475 

71 

483 

71 

492 

71 

500

71 

508 

71 

559

71 

567 

71 

575 

71 

584 

71 

592

71 

642 

71 

650 

71 

659 

71 

667 

71 

675 

71 

725 

71 

734 

71 

742 

71 

750 

71 

759 

71 

809 

71 

817

71 

825 

71 

834 

71 

842 

71 

892 

71 

900 

71 

908 

71 

917 

71 

925 

71 

975 

71 

983 

71 

991 

71 

999 

72 

008 

72 

057

72 

066 

72 

074 

72 

082 

72 

090 

72 

140 

72 

148 

72 

156

72 

165 

72 

173 

72 

222 

72 

230 

72 

239 

72 

247 

72 

255 

72 

304 

72 

313 

72 

321 

72 

329 

72 

337 

72 

387 

72 

395 

72 

403 

72 

411 

72 

419 

72 

469 

72 

477 

72 

485 

72 

493 

72 

501 

72 

550

72

558

72

567

72

575

72

583

72 

632 

72 

640 

72 

648 

72 

656 

72 

665 

72 

713 

72 

722 

72 

730 

72 

738 

72 

746 

72 

795 

72 

803 

72 

811 

72

819 

72 

827 

72 

876 

72 

884 

72 

892 

72 

900 

72 

908 

72 

957

72

965

72 

973 

72 

981 

72 

989 

73 

038 

73 

046 

73 

054 

73 

062 

73 

070 

73 

119 

73 

127 

73 

135 

73 

143 

73 

151

73 

199 

73 

207 

73 

215 

73 

223 

73 

231 

73 

280 

73 

288 

73 

296 

73 

304 

73 

312 

73 

360 

73 

368 

73 

376 

73 

384 

73 

392 

73 

440 

73 

448 

73 

456 

73 

464 

73 

472 

73 

520 

73 

528 

73 

536 

73 

544 

73 

552 

73 

600 

73 

608 

73 

616 

73 

624 

73 

632 

73 

679 

73 

687 

73 

695 

73 

703 

73 

711 

73 

759 

73 

767 

73 

775 

73 

783 

73 

791 

73 

838 

73 

846 

73 

854 

73 

862 

73 

870 

73 

918 

73 

926 

73 

933 

73 

941 

73 

949 

73 

997 

74 

005 

74 

013 

74 

020 

74 

028

Table 7. Five-place Common Logarithms of Numbers (Continued)

550-599

No.  0       1       2       3       4       5       6       7       8       9
550  74 036  74 044  74 052  74 060  74 068  74 076  74 084  74 092  74 099  74 107
551  74 115  74 123  74 131  74 139  74 147  74 155  74 162  74 170  74 178  74 186
552  74 194  74 202  74 210  74 218  74 225  74 233  74 241  74 249  74 257  74 265
553  74 273  74 280  74 288  74 296  74 304  74 312  74 320  74 327  74 335  74 343
554  74 351  74 359  74 367  74 374  74 382  74 390  74 398  74 406  74 414  74 421
555  74 429  74 437  74 445  74 453  74 461  74 468  74 476  74 484  74 492  74 500
556  74 507  74 515  74 523  74 531  74 539  74 547  74 554  74 562  74 570  74 578
557  74 586  74 593  74 601  74 609  74 617  74 624  74 632  74 640  74 648  74 656
558  74 663  74 671  74 679  74 687  74 695  74 702  74 710  74 718  74 726  74 733
559  74 741  74 749  74 757  74 764  74 772  74 780  74 788  74 796  74 803  74 811
560  74 819  74 827  74 834  74 842  74 850  74 858  74 865  74 873  74 881  74 889
561  74 896  74 904  74 912  74 920  74 927  74 935  74 943  74 950  74 958  74 966
562  74 974  74 981  74 989  74 997  75 005  75 012  75 020  75 028  75 035  75 043
563  75 051  75 059  75 066  75 074  75 082  75 089  75 097  75 105  75 113  75 120
564  75 128  75 136  75 143  75 151  75 159  75 166  75 174  75 182  75 189  75 197
565  75 205  75 213  75 220  75 228  75 236  75 243  75 251  75 259  75 266  75 274
566  75 282  75 289  75 297  75 305  75 312  75 320  75 328  75 335  75 343  75 351
567  75 358  75 366  75 374  75 381  75 389  75 397  75 404  75 412  75 420  75 427
568  75 435  75 442  75 450  75 458  75 465  75 473  75 481  75 488  75 496  75 504
569  75 511  75 519  75 526  75 534  75 542  75 549  75 557  75 565  75 572  75 580
570  75 587  75 595  75 603  75 610  75 618  75 626  75 633  75 641  75 648  75 656
571  75 664  75 671  75 679  75 686  75 694  75 702  75 709  75 717  75 724  75 732
572  75 740  75 747  75 755  75 762  75 770  75 778  75 785  75 793  75 800  75 808
573  75 815  75 823  75 831  75 838  75 846  75 853  75 861  75 868  75 876  75 884
574  75 891  75 899  75 906  75 914  75 921  75 929  75 937  75 944  75 952  75 959
575  75 967  75 974  75 982  75 989  75 997  76 005  76 012  76 020  76 027  76 035
576  76 042  76 050  76 057  76 065  76 072  76 080  76 087  76 095  76 103  76 110
577  76 118  76 125  76 133  76 140  76 148  76 155  76 163  76 170  76 178  76 185
578  76 193  76 200  76 208  76 215  76 223  76 230  76 238  76 245  76 253  76 260
579  76 268  76 275  76 283  76 290  76 298  76 305  76 313  76 320  76 328  76 335
580  76 343  76 350  76 358  76 365  76 373  76 380  76 388  76 395  76 403  76 410
581  76 418  76 425  76 433  76 440  76 448  76 455  76 462  76 470  76 477  76 485
582  76 492  76 500  76 507  76 515  76 522  76 530  76 537  76 545  76 552  76 559
583  76 567  76 574  76 582  76 589  76 597  76 604  76 612  76 619  76 626  76 634
584  76 641  76 649  76 656  76 664  76 671  76 678  76 686  76 693  76 701  76 708
585  76 716  76 723  76 730  76 738  76 745  76 753  76 760  76 768  76 775  76 782
586  76 790  76 797  76 805  76 812  76 819  76 827  76 834  76 842  76 849  76 856
587  76 864  76 871  76 879  76 886  76 893  76 901  76 908  76 916  76 923  76 930
588  76 938  76 945  76 953  76 960  76 967  76 975  76 982  76 989  76 997  77 004
589  77 012  77 019  77 026  77 034  77 041  77 048  77 056  77 063  77 070  77 078
590  77 085  77 093  77 100  77 107  77 115  77 122  77 129  77 137  77 144  77 151
591  77 159  77 166  77 173  77 181  77 188  77 195  77 203  77 210  77 217  77 225
592  77 232  77 240  77 247  77 254  77 262  77 269  77 276  77 283  77 291  77 298
593  77 305  77 313  77 320  77 327  77 335  77 342  77 349  77 357  77 364  77 371
594  77 379  77 386  77 393  77 401  77 408  77 415  77 422  77 430  77 437  77 444
595  77 452  77 459  77 466  77 474  77 481  77 488  77 495  77 503  77 510  77 517
596  77 525  77 532  77 539  77 546  77 554  77 561  77 568  77 576  77 583  77 590
597  77 597  77 605  77 612  77 619  77 627  77 634  77 641  77 648  77 656  77 663
598  77 670  77 677  77 685  77 692  77 699  77 706  77 714  77 721  77 728  77 735
599  77 743  77 750  77 757  77 764  77 772  77 779  77 786  77 793  77 801  77 808


550-599

Table 7. Five-place Common Logarithms of Numbers (Continued)

600-649

Table 7. Five-place Common Logarithms of Numbers (Continued)

650-699

No.  0       1       2       3       4       5       6       7       8       9
650  81 291  81 298  81 305  81 311  81 318  81 325  81 331  81 338  81 345  81 351
651  81 358  81 365  81 371  81 378  81 385  81 391  81 398  81 405  81 411  81 418
652  81 425  81 431  81 438  81 445  81 451  81 458  81 465  81 471  81 478  81 485
653  81 491  81 498  81 505  81 511  81 518  81 525  81 531  81 538  81 544  81 551
654  81 558  81 564  81 571  81 578  81 584  81 591  81 598  81 604  81 611  81 617
655  81 624  81 631  81 637  81 644  81 651  81 657  81 664  81 671  81 677  81 684
656  81 690  81 697  81 704  81 710  81 717  81 723  81 730  81 737  81 743  81 750
657  81 757  81 763  81 770  81 776  81 783  81 790  81 796  81 803  81 809  81 816
658  81 823  81 829  81 836  81 842  81 849  81 856  81 862  81 869  81 875  81 882
659  81 889  81 895  81 902  81 908  81 915  81 921  81 928  81 935  81 941  81 948
660  81 954  81 961  81 968  81 974  81 981  81 987  81 994  82 000  82 007  82 014
661  82 020  82 027  82 033  82 040  82 046  82 053  82 060  82 066  82 073  82 079
662  82 086  82 092  82 099  82 105  82 112  82 119  82 125  82 132  82 138  82 145
663  82 151  82 158  82 164  82 171  82 178  82 184  82 191  82 197  82 204  82 210
664  82 217  82 223  82 230  82 236  82 243  82 249  82 256  82 263  82 269  82 276
665  82 282  82 289  82 295  82 302  82 308  82 315  82 321  82 328  82 334  82 341
666  82 347  82 354  82 360  82 367  82 373  82 380  82 387  82 393  82 400  82 406
667  82 413  82 419  82 426  82 432  82 439  82 445  82 452  82 458  82 465  82 471
668  82 478  82 484  82 491  82 497  82 504  82 510  82 517  82 523  82 530  82 536
669  82 543  82 549  82 556  82 562  82 569  82 575  82 582  82 588  82 595  82 601
670  82 607  82 614  82 620  82 627  82 633  82 640  82 646  82 653  82 659  82 666
671  82 672  82 679  82 685  82 692  82 698  82 705  82 711  82 718  82 724  82 730
672  82 737  82 743  82 750  82 756  82 763  82 769  82 776  82 782  82 789  82 795
673  82 802  82 808  82 814  82 821  82 827  82 834  82 840  82 847  82 853  82 860
674  82 866  82 872  82 879  82 885  82 892  82 898  82 905  82 911  82 918  82 924
675  82 930  82 937  82 943  82 950  82 956  82 963  82 969  82 975  82 982  82 988
676  82 995  83 001  83 008  83 014  83 020  83 027  83 033  83 040  83 046  83 052
677  83 059  83 065  83 072  83 078  83 085  83 091  83 097  83 104  83 110  83 117
678  83 123  83 129  83 136  83 142  83 149  83 155  83 161  83 168  83 174  83 181
679  83 187  83 193  83 200  83 206  83 213  83 219  83 225  83 232  83 238  83 245
680  83 251  83 257  83 264  83 270  83 276  83 283  83 289  83 296  83 302  83 308
681  83 315  83 321  83 327  83 334  83 340  83 347  83 353  83 359  83 366  83 372
682  83 378  83 385  83 391  83 398  83 404  83 410  83 417  83 423  83 429  83 436
683  83 442  83 448  83 455  83 461  83 467  83 474  83 480  83 487  83 493  83 499
684  83 506  83 512  83 518  83 525  83 531  83 537  83 544  83 550  83 556  83 563
685  83 569  83 575  83 582  83 588  83 594  83 601  83 607  83 613  83 620  83 626
686  83 632  83 639  83 645  83 651  83 658  83 664  83 670  83 677  83 683  83 689
687  83 696  83 702  83 708  83 715  83 721  83 727  83 734  83 740  83 746  83 753
688  83 759  83 765  83 771  83 778  83 784  83 790  83 797  83 803  83 809  83 816
689  83 822  83 828  83 835  83 841  83 847  83 853  83 860  83 866  83 872  83 879
690  83 885  83 891  83 897  83 904  83 910  83 916  83 923  83 929  83 935  83 942
691  83 948  83 954  83 960  83 967  83 973  83 979  83 985  83 992  83 998  84 004
692  84 011  84 017  84 023  84 029  84 036  84 042  84 048  84 055  84 061  84 067
693  84 073  84 080  84 086  84 092  84 098  84 105  84 111  84 117  84 123  84 130
694  84 136  84 142  84 148  84 155  84 161  84 167  84 173  84 180  84 186  84 192
695  84 198  84 205  84 211  84 217  84 223  84 230  84 236  84 242  84 248  84 255
696  84 261  84 267  84 273  84 280  84 286  84 292  84 298  84 305  84 311  84 317
697  84 323  84 330  84 336  84 342  84 348  84 354  84 361  84 367  84 373  84 379
698  84 386  84 392  84 398  84 404  84 410  84 417  84 423  84 429  84 435  84 442
699  84 448  84 454  84 460  84 466  84 473  84 479  84 485  84 491  84 497  84 504


650-699
Table 7. Five-place Common Logarithms of Numbers (Continued)

700-749 



84 646 84 652 
84 708 84 714 
84 770 84 776 


86 333 86 338 86 344 86 350 86 356 

86 392 86 398 86 404 86 410 86 415 

86 451 86 457 86 463 86 469 86 475 

86 510 86 516 86 522 86 528 86 534

86 570 86 576 86 581 86 587 86 593

86 629 86 635 86 641 86 646 86 652 

86 688 86 694 86 700 86 705 86 711 

86 747 86 753 86 759 86 764 86 770 

86 806 86 812 86 817 86 823 86 829 

86 864 86 870 86 876 86 882 86 888 

86 923 86 929 86 935 86 941 86 947 

86 982 86 988 86 994 86 999 87 005 

87 040 87 046 87 052 87 058 87 064 

87 099 87 105 87 111 87 116 87 122 

87 157 87 163 87 169 87 175 87 181

87 216 87 221 87 227 87 233 87 239 

87 274 87 280 87 286 87 291 87 297 

87 332 87 338 87 344 87 349 87 355 

87 390 87 396 87 402 87 408 87 413 

87 448 87 454 87 460 87 466 87 471 


87 245 87 251 87 256 87 262 87 268 

87 303 87 309 87 315 87 320 87 326 

87 361 87 367 87 373 87 379 87 384 

87 419 87 425 87 431 87 437 87 442 


87 379 87 384 
87 437 87 442 


87 466 87 471 87 477 87 483 87 489 87 495 87 500 



700-749

Table 7. Five-place Common Logarithms of Numbers (Continued)

750-799

No.  0       1       2       3       4       5       6       7       8       9
750  87 506  87 512  87 518  87 523  87 529  87 535  87 541  87 547  87 552  87 558
751  87 564  87 570  87 576  87 581  87 587  87 593  87 599  87 604  87 610  87 616
752  87 622  87 628  87 633  87 639  87 645  87 651  87 656  87 662  87 668  87 674
753  87 679  87 685  87 691  87 697  87 703  87 708  87 714  87 720  87 726  87 731
754  87 737  87 743  87 749  87 754  87 760  87 766  87 772  87 777  87 783  87 789
755  87 795  87 800  87 806  87 812  87 818  87 823  87 829  87 835  87 841  87 846
756  87 852  87 858  87 864  87 869  87 875  87 881  87 887  87 892  87 898  87 904
757  87 910  87 915  87 921  87 927  87 933  87 938  87 944  87 950  87 955  87 961
758  87 967  87 973  87 978  87 984  87 990  87 996  88 001  88 007  88 013  88 018
759  88 024  88 030  88 036  88 041  88 047  88 053  88 058  88 064  88 070  88 076
760  88 081  88 087  88 093  88 098  88 104  88 110  88 116  88 121  88 127  88 133
761  88 138  88 144  88 150  88 156  88 161  88 167  88 173  88 178  88 184  88 190
762  88 195  88 201  88 207  88 213  88 218  88 224  88 230  88 235  88 241  88 247
763  88 252  88 258  88 264  88 270  88 275  88 281  88 287  88 292  88 298  88 304
764  88 309  88 315  88 321  88 326  88 332  88 338  88 343  88 349  88 355  88 360
765  88 366  88 372  88 377  88 383  88 389  88 395  88 400  88 406  88 412  88 417
766  88 423  88 429  88 434  88 440  88 446  88 451  88 457  88 463  88 468  88 474
767  88 480  88 485  88 491  88 497  88 502  88 508  88 513  88 519  88 525  88 530
768  88 536  88 542  88 547  88 553  88 559  88 564  88 570  88 576  88 581  88 587
769  88 593  88 598  88 604  88 610  88 615  88 621  88 627  88 632  88 638  88 643
770  88 649  88 655  88 660  88 666  88 672  88 677  88 683  88 689  88 694  88 700
771  88 705  88 711  88 717  88 722  88 728  88 734  88 739  88 745  88 750  88 756
772  88 762  88 767  88 773  88 779  88 784  88 790  88 795  88 801  88 807  88 812
773  88 818  88 824  88 829  88 835  88 840  88 846  88 852  88 857  88 863  88 868
774  88 874  88 880  88 885  88 891  88 897  88 902  88 908  88 913  88 919  88 925
775  88 930  88 936  88 941  88 947  88 953  88 958  88 964  88 969  88 975  88 981
776  88 986  88 992  88 997  89 003  89 009  89 014  89 020  89 025  89 031  89 037
777  89 042  89 048  89 053  89 059  89 064  89 070  89 076  89 081  89 087  89 092
778  89 098  89 104  89 109  89 115  89 120  89 126  89 131  89 137  89 143  89 148
779  89 154  89 159  89 165  89 170  89 176  89 182  89 187  89 193  89 198  89 204
780  89 209  89 215  89 221  89 226  89 232  89 237  89 243  89 248  89 254  89 260
781  89 265  89 271  89 276  89 282  89 287  89 293  89 298  89 304  89 310  89 315
782  89 321  89 326  89 332  89 337  89 343  89 348  89 354  89 360  89 365  89 371
783  89 376  89 382  89 387  89 393  89 398  89 404  89 409  89 415  89 421  89 426
784  89 432  89 437  89 443  89 448  89 454  89 459  89 465  89 470  89 476  89 481
785  89 487  89 492  89 498  89 504  89 509  89 515  89 520  89 526  89 531  89 537
786  89 542  89 548  89 553  89 559  89 564  89 570  89 575  89 581  89 586  89 592
787  89 597  89 603  89 609  89 614  89 620  89 625  89 631  89 636  89 642  89 647
788  89 653  89 658  89 664  89 669  89 675  89 680  89 686  89 691  89 697  89 702
789  89 708  89 713  89 719  89 724  89 730  89 735  89 741  89 746  89 752  89 757
790  89 763  89 768  89 774  89 779  89 785  89 790  89 796  89 801  89 807  89 812
791  89 818  89 823  89 829  89 834  89 840  89 845  89 851  89 856  89 862  89 867
792  89 873  89 878  89 883  89 889  89 894  89 900  89 905  89 911  89 916  89 922
793  89 927  89 933  89 938  89 944  89 949  89 955  89 960  89 966  89 971  89 977
794  89 982  89 988  89 993  89 998  90 004  90 009  90 015  90 020  90 026  90 031
795  90 037  90 042  90 048  90 053  90 059  90 064  90 069  90 075  90 080  90 086
796  90 091  90 097  90 102  90 108  90 113  90 119  90 124  90 129  90 135  90 140
797  90 146  90 151  90 157  90 162  90 168  90 173  90 179  90 184  90 189  90 195
798  90 200  90 206  90 211  90 217  90 222  90 227  90 233  90 238  90 244  90 249
799  90 255  90 260  90 266  90 271  90 276  90 282  90 287  90 293  90 298  90 304


750-799

APPENDIX

Table 7. Five-place Common Logarithms of Numbers (Continued)

800-849

No.  0       1       2       3       4       5       6       7       8       9
800  90 309  90 314  90 320  90 325  90 331  90 336  90 342  90 347  90 352  90 358
801  90 363  90 369  90 374  90 380  90 385  90 390  90 396  90 401  90 407  90 412
802  90 417  90 423  90 428  90 434  90 439  90 445  90 450  90 455  90 461  90 466
803  90 472  90 477  90 482  90 488  90 493  90 499  90 504  90 509  90 515  90 520
804  90 526  90 531  90 536  90 542  90 547  90 553  90 558  90 563  90 569  90 574
805  90 580  90 585  90 590  90 596  90 601  90 607  90 612  90 617  90 623  90 628
806  90 634  90 639  90 644  90 650  90 655  90 660  90 666  90 671  90 677  90 682
807  90 687  90 693  90 698  90 703  90 709  90 714  90 720  90 725  90 730  90 736
808  90 741  90 747  90 752  90 757  90 763  90 768  90 773  90 779  90 784  90 789
809  90 795  90 800  90 806  90 811  90 816  90 822  90 827  90 832  90 838  90 843
810  90 849  90 854  90 859  90 865  90 870  90 875  90 881  90 886  90 891  90 897
811  90 902  90 907  90 913  90 918  90 924  90 929  90 934  90 940  90 945  90 950
812  90 956  90 961  90 966  90 972  90 977  90 982  90 988  90 993  90 998  91 004
813  91 009  91 014  91 020  91 025  91 030  91 036  91 041  91 046  91 052  91 057
814  91 062  91 068  91 073  91 078  91 084  91 089  91 094  91 100  91 105  91 110
815  91 116  91 121  91 126  91 132  91 137  91 142  91 148  91 153  91 158  91 164
816  91 169  91 174  91 180  91 185  91 190  91 196  91 201  91 206  91 212  91 217
817  91 222  91 228  91 233  91 238  91 243  91 249  91 254  91 259  91 265  91 270
818  91 275  91 281  91 286  91 291  91 297  91 302  91 307  91 312  91 318  91 323
819  91 328  91 334  91 339  91 344  91 350  91 355  91 360  91 365  91 371  91 376
820  91 381  91 387  91 392  91 397  91 403  91 408  91 413  91 418  91 424  91 429
821  91 434  91 440  91 445  91 450  91 455  91 461  91 466  91 471  91 477  91 482
822  91 487  91 492  91 498  91 503  91 508  91 514  91 519  91 524  91 529  91 535
823  91 540  91 545  91 551  91 556  91 561  91 566  91 572  91 577  91 582  91 587
824  91 593  91 598  91 603  91 609  91 614  91 619  91 624  91 630  91 635  91 640
825  91 645  91 651  91 656  91 661  91 666  91 672  91 677  91 682  91 687  91 693
826  91 698  91 703  91 709  91 714  91 719  91 724  91 730  91 735  91 740  91 745
827  91 751  91 756  91 761  91 766  91 772  91 777  91 782  91 787  91 793  91 798
828  91 803  91 808  91 814  91 819  91 824  91 829  91 834  91 840  91 845  91 850
829  91 855  91 861  91 866  91 871  91 876  91 882  91 887  91 892  91 897  91 903
830  91 908  91 913  91 918  91 924  91 929  91 934  91 939  91 944  91 950  91 955
831  91 960  91 965  91 971  91 976  91 981  91 986  91 991  91 997  92 002  92 007
832  92 012  92 018  92 023  92 028  92 033  92 038  92 044  92 049  92 054  92 059
833  92 065  92 070  92 075  92 080  92 085  92 091  92 096  92 101  92 106  92 111
834  92 117  92 122  92 127  92 132  92 137  92 143  92 148  92 153  92 158  92 163
835  92 169  92 174  92 179  92 184  92 189  92 195  92 200  92 205  92 210  92 215
836  92 221  92 226  92 231  92 236  92 241  92 247  92 252  92 257  92 262  92 267
837  92 273  92 278  92 283  92 288  92 293  92 298  92 304  92 309  92 314  92 319
838  92 324  92 330  92 335  92 340  92 345  92 350  92 355  92 361  92 366  92 371
839  92 376  92 381  92 387  92 392  92 397  92 402  92 407  92 412  92 418  92 423
840  92 428  92 433  92 438  92 443  92 449  92 454  92 459  92 464  92 469  92 474
841  92 480  92 485  92 490  92 495  92 500  92 505  92 511  92 516  92 521  92 526
842  92 531  92 536  92 542  92 547  92 552  92 557  92 562  92 567  92 572  92 578
843  92 583  92 588  92 593  92 598  92 603  92 609  92 614  92 619  92 624  92 629
844  92 634  92 639  92 645  92 650  92 655  92 660  92 665  92 670  92 675  92 681
845  92 686  92 691  92 696  92 701  92 706  92 711  92 716  92 722  92 727  92 732
846  92 737  92 742  92 747  92 752  92 758  92 763  92 768  92 773  92 778  92 783
847  92 788  92 793  92 799  92 804  92 809  92 814  92 819  92 824  92 829  92 834
848  92 840  92 845  92 850  92 855  92 860  92 865  92 870  92 875  92 881  92 886
849  92 891  92 896  92 901  92 906  92 911  92 916  92 921  92 927  92 932  92 937



800-849

Table 7. Five-place Common Logarithms of Numbers (Continued)

850-899

No.  0       1       2       3       4       5       6       7       8       9
850  92 942  92 947  92 952  92 957  92 962  92 967  92 973  92 978  92 983  92 988
851  92 993  92 998  93 003  93 008  93 013  93 018  93 024  93 029  93 034  93 039
852  93 044  93 049  93 054  93 059  93 064  93 069  93 075  93 080  93 085  93 090
853  93 095  93 100  93 105  93 110  93 115  93 120  93 125  93 131  93 136  93 141
854  93 146  93 151  93 156  93 161  93 166  93 171  93 176  93 181  93 186  93 192
855  93 197  93 202  93 207  93 212  93 217  93 222  93 227  93 232  93 237  93 242
856  93 247  93 252  93 258  93 263  93 268  93 273  93 278  93 283  93 288  93 293
857  93 298  93 303  93 308  93 313  93 318  93 323  93 328  93 334  93 339  93 344
858  93 349  93 354  93 359  93 364  93 369  93 374  93 379  93 384  93 389  93 394
859  93 399  93 404  93 409  93 414  93 420  93 425  93 430  93 435  93 440  93 445
860  93 450  93 455  93 460  93 465  93 470  93 475  93 480  93 485  93 490  93 495
861  93 500  93 505  93 510  93 515  93 520  93 526  93 531  93 536  93 541  93 546
862  93 551  93 556  93 561  93 566  93 571  93 576  93 581  93 586  93 591  93 596
863  93 601  93 606  93 611  93 616  93 621  93 626  93 631  93 636  93 641  93 646
864  93 651  93 656  93 661  93 666  93 671  93 676  93 682  93 687  93 692  93 697
865  93 702  93 707  93 712  93 717  93 722  93 727  93 732  93 737  93 742  93 747
866  93 752  93 757  93 762  93 767  93 772  93 777  93 782  93 787  93 792  93 797
867  93 802  93 807  93 812  93 817  93 822  93 827  93 832  93 837  93 842  93 847
868  93 852  93 857  93 862  93 867  93 872  93 877  93 882  93 887  93 892  93 897
869  93 902  93 907  93 912  93 917  93 922  93 927  93 932  93 937  93 942  93 947
870  93 952  93 957  93 962  93 967  93 972  93 977  93 982  93 987  93 992  93 997
871  94 002  94 007  94 012  94 017  94 022  94 027  94 032  94 037  94 042  94 047
872  94 052  94 057  94 062  94 067  94 072  94 077  94 082  94 086  94 091  94 096
873  94 101  94 106  94 111  94 116  94 121  94 126  94 131  94 136  94 141  94 146
874  94 151  94 156  94 161  94 166  94 171  94 176  94 181  94 186  94 191  94 196
875  94 201  94 206  94 211  94 216  94 221  94 226  94 231  94 236  94 240  94 245
876  94 250  94 255  94 260  94 265  94 270  94 275  94 280  94 285  94 290  94 295
877  94 300  94 305  94 310  94 315  94 320  94 325  94 330  94 335  94 340  94 345
878  94 349  94 354  94 359  94 364  94 369  94 374  94 379  94 384  94 389  94 394
879  94 399  94 404  94 409  94 414  94 419  94 424  94 429  94 433  94 438  94 443
880  94 448  94 453  94 458  94 463  94 468  94 473  94 478  94 483  94 488  94 493
881  94 498  94 503  94 507  94 512  94 517  94 522  94 527  94 532  94 537  94 542
882  94 547  94 552  94 557  94 562  94 567  94 571  94 576  94 581  94 586  94 591
883  94 596  94 601  94 606  94 611  94 616  94 621  94 626  94 630  94 635  94 640
884  94 645  94 650  94 655  94 660  94 665  94 670  94 675  94 680  94 685  94 689
885  94 694  94 699  94 704  94 709  94 714  94 719  94 724  94 729  94 734  94 738
886  94 743  94 748  94 753  94 758  94 763  94 768  94 773  94 778  94 783  94 787
887  94 792  94 797  94 802  94 807  94 812  94 817  94 822  94 827  94 832  94 836
888  94 841  94 846  94 851  94 856  94 861  94 866  94 871  94 876  94 880  94 885
889  94 890  94 895  94 900  94 905  94 910  94 915  94 919  94 924  94 929  94 934
890  94 939  94 944  94 949  94 954  94 959  94 963  94 968  94 973  94 978  94 983
891  94 988  94 993  94 998  95 002  95 007  95 012  95 017  95 022  95 027  95 032
892  95 036  95 041  95 046  95 051  95 056  95 061  95 066  95 071  95 075  95 080
893  95 085  95 090  95 095  95 100  95 105  95 109  95 114  95 119  95 124  95 129
894  95 134  95 139  95 143  95 148  95 153  95 158  95 163  95 168  95 173  95 177
895  95 182  95 187  95 192  95 197  95 202  95 207  95 211  95 216  95 221  95 226
896  95 231  95 236  95 240  95 245  95 250  95 255  95 260  95 265  95 270  95 274
897  95 279  95 284  95 289  95 294  95 299  95 303  95 308  95 313  95 318  95 323
898  95 328  95 332  95 337  95 342  95 347  95 352  95 357  95 361  95 366  95 371
899  95 376  95 381  95 386  95 390  95 395  95 400  95 405  95 410  95 415  95 419


860-899 
Table 7. Five-place Common Logarithms of Numbers (Continued)

900-949

No.        0       1       2       3       4       5       6       7       8       9

900   95 424  95 429  95 434  95 439  95 444  95 448  95 453  95 458  95 463  95 468
901   95 472  95 477  95 482  95 487  95 492  95 497  95 501  95 506  95 511  95 516
902   95 521  95 525  95 530  95 535  95 540  95 545  95 550  95 554  95 559  95 564
903   95 569  95 574  95 578  95 583  95 588  95 593  95 598  95 602  95 607  95 612
904   95 617  95 622  95 626  95 631  95 636  95 641  95 646  95 650  95 655  95 660
905   95 665  95 670  95 674  95 679  95 684  95 689  95 694  95 698  95 703  95 708
906   95 713  95 718  95 722  95 727  95 732  95 737  95 742  95 746  95 751  95 756
907   95 761  95 766  95 770  95 775  95 780  95 785  95 789  95 794  95 799  95 804
908   95 809  95 813  95 818  95 823  95 828  95 832  95 837  95 842  95 847  95 852
909   95 856  95 861  95 866  95 871  95 875  95 880  95 885  95 890  95 895  95 899
910   95 904  95 909  95 914  95 918  95 923  95 928  95 933  95 938  95 942  95 947
911   95 952  95 957  95 961  95 966  95 971  95 976  95 980  95 985  95 990  95 995
912   95 999  96 004  96 009  96 014  96 019  96 023  96 028  96 033  96 038  96 042
913   96 047  96 052  96 057  96 061  96 066  96 071  96 076  96 080  96 085  96 090
914   96 095  96 099  96 104  96 109  96 114  96 118  96 123  96 128  96 133  96 137
915   96 142  96 147  96 152  96 156  96 161  96 166  96 171  96 175  96 180  96 185
916   96 190  96 194  96 199  96 204  96 209  96 213  96 218  96 223  96 227  96 232
917   96 237  96 242  96 246  96 251  96 256  96 261  96 265  96 270  96 275  96 280
918   96 284  96 289  96 294  96 298  96 303  96 308  96 313  96 317  96 322  96 327
919   96 332  96 336  96 341  96 346  96 350  96 355  96 360  96 365  96 369  96 374
920   96 379  96 384  96 388  96 393  96 398  96 402  96 407  96 412  96 417  96 421
921   96 426  96 431  96 435  96 440  96 445  96 450  96 454  96 459  96 464  96 468
922   96 473  96 478  96 483  96 487  96 492  96 497  96 501  96 506  96 511  96 515
923   96 520  96 525  96 530  96 534  96 539  96 544  96 548  96 553  96 558  96 562
924   96 567  96 572  96 577  96 581  96 586  96 591  96 595  96 600  96 605  96 609
925   96 614  96 619  96 624  96 628  96 633  96 638  96 642  96 647  96 652  96 656
926   96 661  96 666  96 670  96 675  96 680  96 685  96 689  96 694  96 699  96 703
927   96 708  96 713  96 717  96 722  96 727  96 731  96 736  96 741  96 745  96 750
928   96 755  96 759  96 764  96 769  96 774  96 778  96 783  96 788  96 792  96 797
929   96 802  96 806  96 811  96 816  96 820  96 825  96 830  96 834  96 839  96 844
930   96 848  96 853  96 858  96 862  96 867  96 872  96 876  96 881  96 886  96 890
931   96 895  96 900  96 904  96 909  96 914  96 918  96 923  96 928  96 932  96 937
932   96 942  96 946  96 951  96 956  96 960  96 965  96 970  96 974  96 979  96 984
933   96 988  96 993  96 997  97 002  97 007  97 011  97 016  97 021  97 025  97 030
934   97 035  97 039  97 044  97 049  97 053  97 058  97 063  97 067  97 072  97 077
935   97 081  97 086  97 090  97 095  97 100  97 104  97 109  97 114  97 118  97 123
936   97 128  97 132  97 137  97 142  97 146  97 151  97 155  97 160  97 165  97 169
937   97 174  97 179  97 183  97 188  97 192  97 197  97 202  97 206  97 211  97 216
938   97 220  97 225  97 230  97 234  97 239  97 243  97 248  97 253  97 257  97 262
939   97 267  97 271  97 276  97 280  97 285  97 290  97 294  97 299  97 304  97 308
940   97 313  97 317  97 322  97 327  97 331  97 336  97 340  97 345  97 350  97 354
941   97 359  97 364  97 368  97 373  97 377  97 382  97 387  97 391  97 396  97 400
942   97 405  97 410  97 414  97 419  97 424  97 428  97 433  97 437  97 442  97 447
943   97 451  97 456  97 460  97 465  97 470  97 474  97 479  97 483  97 488  97 493
944   97 497  97 502  97 506  97 511  97 516  97 520  97 525  97 529  97 534  97 539
945   97 543  97 548  97 552  97 557  97 562  97 566  97 571  97 575  97 580  97 585
946   97 589  97 594  97 598  97 603  97 607  97 612  97 617  97 621  97 626  97 630
947   97 635  97 640  97 644  97 649  97 653  97 658  97 663  97 667  97 672  97 676
948   97 681  97 685  97 690  97 695  97 699  97 704  97 708  97 713  97 717  97 722
949   97 727  97 731  97 736  97 740  97 745  97 749  97 754  97 759  97 763  97 768

No.        0       1       2       3       4       5       6       7       8       9

900-949
Table 7. Five-place Common Logarithms of Numbers (Continued)

950-1000

No.        0       1       2       3       4       5       6       7       8       9

950   97 772  97 777  97 782  97 786  97 791  97 795  97 800  97 804  97 809  97 813
951   97 818  97 823  97 827  97 832  97 836  97 841  97 845  97 850  97 855  97 859
952   97 864  97 868  97 873  97 877  97 882  97 886  97 891  97 896  97 900  97 905
953   97 909  97 914  97 918  97 923  97 928  97 932  97 937  97 941  97 946  97 950
954   97 955  97 959  97 964  97 968  97 973  97 978  97 982  97 987  97 991  97 996
955   98 000  98 005  98 009  98 014  98 019  98 023  98 028  98 032  98 037  98 041
956   98 046  98 050  98 055  98 059  98 064  98 068  98 073  98 078  98 082  98 087
957   98 091  98 096  98 100  98 105  98 109  98 114  98 118  98 123  98 127  98 132
958   98 137  98 141  98 146  98 150  98 155  98 159  98 164  98 168  98 173  98 177
959   98 182  98 186  98 191  98 195  98 200  98 204  98 209  98 214  98 218  98 223
960   98 227  98 232  98 236  98 241  98 245  98 250  98 254  98 259  98 263  98 268
961   98 272  98 277  98 281  98 286  98 290  98 295  98 299  98 304  98 308  98 313
962   98 318  98 322  98 327  98 331  98 336  98 340  98 345  98 349  98 354  98 358
963   98 363  98 367  98 372  98 376  98 381  98 385  98 390  98 394  98 399  98 403
964   98 408  98 412  98 417  98 421  98 426  98 430  98 435  98 439  98 444  98 448
965   98 453  98 457  98 462  98 466  98 471  98 475  98 480  98 484  98 489  98 493
966   98 498  98 502  98 507  98 511  98 516  98 520  98 525  98 529  98 534  98 538
967   98 543  98 547  98 552  98 556  98 561  98 565  98 570  98 574  98 579  98 583
968   98 588  98 592  98 597  98 601  98 605  98 610  98 614  98 619  98 623  98 628
969   98 632  98 637  98 641  98 646  98 650  98 655  98 659  98 664  98 668  98 673
970   98 677  98 682  98 686  98 691  98 695  98 700  98 704  98 709  98 713  98 717
971   98 722  98 726  98 731  98 735  98 740  98 744  98 749  98 753  98 758  98 762
972   98 767  98 771  98 776  98 780  98 784  98 789  98 793  98 798  98 802  98 807
973   98 811  98 816  98 820  98 825  98 829  98 834  98 838  98 843  98 847  98 851
974   98 856  98 860  98 865  98 869  98 874  98 878  98 883  98 887  98 892  98 896
975   98 900  98 905  98 909  98 914  98 918  98 923  98 927  98 932  98 936  98 941
976   98 945  98 949  98 954  98 958  98 963  98 967  98 972  98 976  98 981  98 985
977   98 989  98 994  98 998  99 003  99 007  99 012  99 016  99 021  99 025  99 029
978   99 034  99 038  99 043  99 047  99 052  99 056  99 061  99 065  99 069  99 074
979   99 078  99 083  99 087  99 092  99 096  99 100  99 105  99 109  99 114  99 118
980   99 123  99 127  99 131  99 136  99 140  99 145  99 149  99 154  99 158  99 162
981   99 167  99 171  99 176  99 180  99 185  99 189  99 193  99 198  99 202  99 207
982   99 211  99 216  99 220  99 224  99 229  99 233  99 238  99 242  99 247  99 251
983   99 255  99 260  99 264  99 269  99 273  99 277  99 282  99 286  99 291  99 295
984   99 300  99 304  99 308  99 313  99 317  99 322  99 326  99 330  99 335  99 339
985   99 344  99 348  99 352  99 357  99 361  99 366  99 370  99 374  99 379  99 383
986   99 388  99 392  99 396  99 401  99 405  99 410  99 414  99 419  99 423  99 427
987   99 432  99 436  99 441  99 445  99 449  99 454  99 458  99 463  99 467  99 471
988   99 476  99 480  99 484  99 489  99 493  99 498  99 502  99 506  99 511  99 515
989   99 520  99 524  99 528  99 533  99 537  99 542  99 546  99 550  99 555  99 559
990   99 564  99 568  99 572  99 577  99 581  99 585  99 590  99 594  99 599  99 603
991   99 607  99 612  99 616  99 621  99 625  99 629  99 634  99 638  99 642  99 647
992   99 651  99 656  99 660  99 664  99 669  99 673  99 677  99 682  99 686  99 691
993   99 695  99 699  99 704  99 708  99 712  99 717  99 721  99 726  99 730  99 734
994   99 739  99 743  99 747  99 752  99 756  99 760  99 765  99 769  99 774  99 778
995   99 782  99 787  99 791  99 795  99 800  99 804  99 808  99 813  99 817  99 822
996   99 826  99 830  99 835  99 839  99 843  99 848  99 852  99 856  99 861  99 865
997   99 870  99 874  99 878  99 883  99 887  99 891  99 896  99 900  99 904  99 909
998   99 913  99 917  99 922  99 926  99 930  99 935  99 939  99 944  99 948  99 952
999   99 957  99 961  99 965  99 970  99 974  99 978  99 983  99 987  99 991  99 996
1000  00 000  00 004  00 009  00 013  00 017  00 022  00 026  00 030  00 035  00 039

No.        0       1       2       3       4       5       6       7       8       9

950-1000













Index 


A 

Accuracy, testing a statistical sched- 
ule for, 42 

Actuarial method, 24-25
Alienation, coefficient of, 182, 190,
193

Analysis of statistical data, 50 
Arithmetic mean, 99 

(See also Mean, arithmetic)
Arkin, Herbert, 50 
Array, frequency, 60-61, 66 
Attribute, 28, 231, 234 
Average, need for, 94 
representativeness of, 108, 130- 
131 

Average deviation, 122-124 
(See also Deviation, mean) 
Averages, 94 

B 

Bar charts, 88-89 

Barr, A. S., C. V. Good, and D. E.
Scates, 30

Baten, W. D., 170, 254

Bernard, L. L., 9

Bernoulli sample, 224

Beta, β, measure of kurtosis, 168

Bias, 32

Bimodal, 95

Binet, Stanford-, intelligence test, 12,
19

Binomial coefficients, 305
Binomial distribution, 151-156
asymmetrical (skewed), 156
formulas for, 151, 152
mean of, formula for the, 155
standard deviation of, formula for
the, 155, 234


Binomial distribution, universe, 233
Biserial correlation, 199-203

(See also Correlation, biserial)
Bowley, A. L., 51, 53
Brown, Lyndon O., 55
Burgess, E. W., and L. J. Cottrell, 
20 

Burtt, E. A., 11, 30 
C 

Camp, B. H., 170, 195
Campbell, N. R., 23
Caption of a frequency table, 71
Cardinal number, 15
Cards, machine tabulating, 48-49
Causal system, sampling a, 229
Causes, search for, 26
Census, United States Bureau of the,
3, 33

Census of Agriculture, U. S., 1935,
34

definition of a "farm," 34-36
other definitions, 36
Chaddock, R. E., 30, 75, 121, 142,
195, 297

Changing universe, 222
Chapin, F. Stuart, 12, 20, 23, 30, 55
Charlier check, 127
Chi-square, χ², 304
substitute for standard error of
coefficient of contingency, 217
test applied to a contingency
table, 148-149, 205-206, 208
to a fourfold table, 209
used to test significance of differ-
ences between two frequency
distributions, 269-272
Class intervals, 61
selection of, 64-68
Class limits, continuous variable, 69
discrete variable, 69
Classes, 61
Classification, 10
principles of, 69
reliability of, 197-198
Coding, 129

use of, in computing measures of
dispersion and partition, 129
Coefficient of alienation, in linear
correlation, formulas for, 182,
183, 190

Coefficient of contingency, 203-208
Chi-square as substitute for
standard error, 217
computation of, 204-206
correction for broad grouping, 207
formulas for, 206
interpretation of, 208
sign of, 208
standard error of, 217
tabular arrangement for, 204
Coefficient of correlation, for
fourfold tables, 211
standard error of, 217
Coefficient of linear correlation, r,
grouped data, formulas for,
185

significance of, 257-258
significance of the difference
between two r's, 268-269
values of the correlation coeffi-
cient for different levels of
significance, 306
values of z for given values of r,
307-308

ungrouped data, formulas for,
181, 183, 185
meaning of, 182-184
size of sample, 182
Coefficient of regression, linear cor-
relation, 180

Coefficient of variation, 129-131 
(See also Variation, coefficient 
of) 

Combinations, 143-144 
formula for, 144 


Comparable measures (scores, 
scales), 136-139 
percentiles, 137-138 
Q scores, 137 
standard scores, 136 
Concomitant variation, 26 
Confidence limits, 248-249 
(See also Fiducial limits) 
Contingency, coefficient of, 203-208
(See also Coefficient of contin-
gency)

Contingency table, 25
Continuous variable, 28, 62
Control group, 26
Cooperative definition, 44-45
Coordinates of a point, 82, 172
Correlation, biserial, 199-203
formula for rbis, 201
rbis compared with r, 203
scatter diagram, 200
sign of rbis, 203
standard error of rbis, 217
table, 201

contingency, 203-208

(See also Coefficient of con-
tingency)

in fourfold tables, 208-217

(See also Yule's Q; Coefficient
of correlation, r4, for four-
fold tables; Tetrachoric
correlation)
nonquantitative, 197
biserial, 199-203

(See also Biserial correla-
tion; Correlation, bi-
serial)

choice of method, 198-199
coefficient of contingency, 203-
208

(See also Coefficient of
contingency; Correlation,
contingency)

r4, for fourfold tables, 211
tetrachoric correlation, 211-217
Yule's Q, 210, 213
rank, 191
formula for, 191
Correlation, simple linear quantita-
tive, 171-196

grouped data, correlation table 
and its explanation, 186-189 
formula for coefficient of 
alienation, 190 
formula for r, 185 
formula for standard error of 
estimate, 190 

formula for Y-intercept, 190
formulas for regression coeffi-
cient, 190

ungrouped data, coefficient of
alienation, k, 182-183
coefficient of correlation, r,
measuring amount of cor-
relation, 180-184
correlation due to a single
case, 172

does not extend beyond data,
173-174

formulas for r, 181-183
goodness of fit and standard
error of estimate, 177-180
line of regression, 175-180,
184, 185

negative, 174-175
normal equations, 175-176
positive, 174

regression coefficient, 180
scatter diagram, 171-174
tetrachoric, 211-217

(See also Tetrachoric correla-
tion)

between time series, 286-288
Cottrell, L. J., and E. W. Burgess, 20
Counting, 10

Cowden, D. J., and F. E. Croxton,
23, 93, 121, 142, 170, 182, 195,
254, 297

Critical ratio, 258
Crosshatching, 90-91
Croxton, F. E., and D. J. Cowden,
23, 93, 121, 142, 170, 182, 195,
254, 297

Culver, Dorothy C., 34
Cumulative frequency curve (ogive),
79-81


Curve, of error, 157

(See also Normal curve)
of probabilities, 157

(See also Normal curve)
Cycles, correlation between, in two
series, 286-288
short-term, 283-286
short-term, freed from seasonal
fluctuations, 295-296
in time series, 286

D 

Dampier-Whetham, W. C. D., 9
Davenport, C. B., and M. P.
Ekas, 217, 220

Davies, G. R., and Dale Yoder,
142, 195, 297
Deciles, 131, 134
Definition, 10, 44-45
Degrees of freedom, 148-149
Delta, Δ, 95

Deviation, mean or average, 122-124 
formula for, 123 
measures of, 122-142 
from an average, 122 
use of coding in computation,
129

quartile, 135-136

(See also Quartile deviation)
standard, σ, 124-129
computation, grouped data, long
method, 127
short method, 127-128
ungrouped data, long method,
126

short method, 126
formula for, combined distribu-
tions, 128-129
Sheppard's correction, 128
ungrouped and grouped data,
124-125
Dewey, John, 30
Dichotomy, 24, 208
Differences, between any two statis-
tics, 259-260

significance of sampling, 255-275
Differences, between statistics from
more than two samples, 272-273
Discrete aggregate, 18
Discrete variable, 62
Dispersion (see Deviation)
Distribution, 232
sampling, 232

(See also Frequency distribu-
tion)

Districts, 243

standard errors of sampling, 243-
244, 246

Durost, W. N., and Helen M.
Walker, 75

E 

Editing the statistical schedule, 47
Ekas, M. P., and C. B. Davenport,
217, 220

Elderton, W. P., 220
Elmer, M. C., 55
Empirical standard error, 232 
Equally likely events, 149 
Error, accumulative, 52-53 
curve of, 157 

(See also Normal curve) 
of observation (record), 50-54 
probable, 161, 232 

(See also Probable error) 
in a ratio, 53 
relative, 52 

standard, 161, 217, 232-249 
(See also Standard error) 
Errors, biased, 50, 53 
unbiased (compensating), 52
Event, 145, 221, 243-244, 246
Existent universe, 222
Expected value, 221 
Experimental group, 26 
Exponent, 109 

Ezekiel, Mordecai, 29, 182, 183, 195 
F 

Factor control, 24 

Failure (unsuccessful event), 149, 
222 


Farm, definition of a, U. S. Census
of Agriculture, 1935, 33-36
Federal agencies as sources of
statistical data, 33
Fiducial limits, 248-249

(See also Confidence limits)
Final test, 28
Fine, H. B., 170
Fisher, R. A., 30, 182, 195, 306
Fourfold tables, correlation in, 208-
217

Fourth moment, 165
Freedom, degrees of, 148, 149
(See also Degrees of freedom)
Frequencies, 60
Frequency, 235-239
standard error of simple sampling
of a, 235-239

of stratified sampling of a, 237
Frequency array, 60-61, 66
Frequency distribution, 60-69, 71-
72, 107-108, 269-272
continuous variable, tabulation of,
68-69

discrete variable, tabulation of,
60-68

rules of table form, 71-72
shapes of, 107-108
significance of the difference be-
tween two or more, 269-272
Frequency distributions, nonquanti-
tative variable, tabulation of,
69-70

Frequency polygon, 76-79
Fry, C. Luther, 55
“Fundamental interval,” in social
measurement, 19

G 

g1, index of skewness, 165
formula for, 165
significance of, 258-259
(See also Skewness)
g2, index of kurtosis, 165
formula for, 165
significance of, 258-259
(See also Kurtosis)
Galton, Sir Francis, 3
Garrett, H. E., 75, 142, 195
Gaussian curve, 157 

(See also Normal curve) 
Geometric mean, 109-113 
applied to population growth, 
111-113 

formulas for, 109-110 
Gevorkiantz, S. R., and B. D.
Mudgett, 231
Giddings, F. H., 9
Good, C. V., A. S. Barr, and D. E.
Scates, 30

Goodness of fit of regression line,
177-178

Goulden, C. H., 30 
Graphs, 76 
maps, 90-91 
misuse of, 86 
pictographs, 90-91 
pie chart, 89

steepness of a line, meaning of, 116
three-dimension, 89 

(See also Bar charts, Cumula- 
tive curve (ogive); Histo- 
gram; Lorenz curve; Polygon; 
Population growth graphs; 
Semilogarithmic graph; 
Smoothed curve) 

Gross reproduction rate, 116-117 
Grouping errors, 128 
Groups of events, 243 
standard errors of sampling, 243-
244, 246

Guilford, J. P., 18
H 

Heterogeneous universe, 223, 246 
Histogram, 76-79 

Holzinger, Karl J., 18, 170, 217, 220
Homogeneous universe, 223
Hooton, A. E., 219
Horst, Paul, 137
Hypothesis, 32
null, 154

Hypothetical universe, 222, 225

I

Independent events, 260 
Index, 15, 16, 44, 45 
Individual, the, and statistics, 7 
Infinite universe, 222
Instructions accompanying a sta- 
tistical schedule, 3^0 
Intangibles, measurement of, 18-20 
Intercept on the Y axis, 175, 190 
Interfering variables, 29 
Interpretation of statistical results, 7 
Interquartile range, 136 
graph of, 136 

Interviewer, the statistical, 42, 47 
J 

J-type distribution, 107-108 
Jocher, Katherine, and Howard W.
Odum, 55
Johnson, H. M., 23
Judges, use of, in social measure-
ment, 15, 16

K

Karsten, K. G., 93
Kelley, T. L., 142

Kendall, M. G., and G. U. Yule,
24, 75, 121, 170, 175, 182, 196,
220, 254
King, W. I., 105
Kirkpatrick, Clifford, 23
Kuhlman, A. F., 34
Kurtosis, 165-168
formula for, 165
g2, index of, 165

L 

Laboratory sciences, 5 
Leptokurtic, 165 

Less-than cumulative frequency 
curve (ogive), 79-81 
Levels of significance, 256-257
Lexis sample, 231
Limited universe, 222

correction of standard error for,
242-243

Lindquist, E. F., 142, 195
Line of regression, 175-180

(See also Regression, line of)
Linear correlation, 171-196
(See also Correlation)
Logarithms, 323
five-place, 323-342
Lorenz curve, 82-83
Lundberg, G. A., 9, 23, 55

M 

McCormick, Thomas C., 50, 212
Maps, 90-91

Marriage, predicting success or
failure in, 20
Matching, 26

Mathematical statistics, 3, 5
Mean, arithmetic, 100
characteristics and interpretation,
104-109

definition of, 100 

grouped data, equal classes, short 
method, 101-103 
long method, 100 
unequal classes, short method, 
103-104 

significance of the difference be- 
tween two means, 264-266 
standard error of simple sampling
of the, 239-240

of stratified sampling of the, 240
of two distributions combined, 63, 
104 

ungrouped data, 99 
weighted, 63, 104 
Mean, geometric, 109-113 

(See also Geometric mean) 
Mean deviation, 122-124 

(See also Deviation, mean) 
Mean probability, 230 
Measurement, of amount, 11 
rules of, 21-22 

Mechanical method, statistics not a,
8 


Mechanical tabulation of statistical 
data, 48-50 
Median, 97 

characteristics and interpretation 
of, 104-109 
definition of, 97 
grouped data, 97-99 
ungrouped data, 96-97
Merrill, Maud A., and Lewis M.
Terman, 23
Merton, R. K., 16
Mesokurtic, 165
Mid-point, 62-64

Mills, F. C., 9, 75, 142, 196, 254,
283, 297
Mode, 94-95
bimodal distribution, 95
characteristics and interpretation,
104-109
definition, 94

formula for, 95

Moments, 165-166
Mu, μ, 165

Mudgett, B. D., 75, 93
and S. R. Gevorkiantz, 231
Mutually exclusive events, 146 

N 

National Unemployment Census of 
1937, 39-42 

Negative correlation, 174-175
Net reproduction rate, 116-117
Nonquantitative methods, role of, 31
Nonquantitative variable, defined, 
69 

tabulation of, 69-70 
Normal distribution (curve), 156- 
163 

approximation of symmetrical bi- 
nomial, 156-157 
areas and ordinates of, 299-303
calculation of ordinates of, 158
formulas for, 157
graphs of, 156
table showing a, 159
use in determining probabilities,
160-163
Normal equations, straight line,
175-176

Normalization, 137 
Nu, 165 

Null hypothesis, 154 
O 

Odum, Howard W., and Katherine
Jocher, 55
Ogburn, W. F., 9
Ogive, 79-81
Ordered data, 11
Ordinal number, 15
Ordinate, 158
Origins of statistics, 3

P

Palmer, Vivien M., 55
Parameter, 221

definition of, 221, 231
Parent, synonym for universe, 221
Partition values, 131-136
decile (see Decile)
median (see Median)
percentile (see Percentile)
quartile (see Quartile)

Pearson, Karl, 3, 217 
Percentile, 131-134, 136 
formula for, 133 
Percentile rank, 134-136 
formula for, 135 
Permutations, 143-144 
formula for, 143 

Peters, C. C., and W. R. Van Voor-
his, 30, 207, 217, 220, 254
Pictographs, 90-91 
Pie chart, 89 
Platykurtic, 165 

Poisson (stratified) sample, 224, 
230-231, 234 

Polygon, frequency, 76-79 
Population, synonym for universe,
221

Population growth, 82, 111 
estimates of, 111-113 
graphs of, 82-87 


Population rates, 114-117 
gross reproduction rate, 116-117 
meaning of, 114-116
net reproduction rate, 116-117 
standard error of, 244-246 
Positive correlation, 174 
Prediction of a mean vs. individual 
values, 250 
Pretest, 28 

Primary statistical data, 37 
Probabilities, curve of, 157
(See also Normal curve)
Probability, 145-151
addition theorem, 146
definition of, 145
mean, 230

product theorem, 147
of r successes in n trials, formula
for, 150

Probable error, 161, 232
Problem in statistical inquiry, 31
Proportion, 238

standard error of simple sampling
of, 238-239

of stratified sampling of, 239
Proportional sample, 230
Proportions, 266

significance of the difference be-
tween two, 266-268
Punching machine, 49

Q 

Q, Yule’s coefficient of correlation
for fourfold tables, 210
Q scores, 137
Qualitative data, 197
Quality, 4

Quantification of social data, 10-23
Quantity, 4 

Quartile deviation, 135-136 
formula for, 136 
Quartiles, 131-134, 136-137 
Questionnaire, 37 
Quetelet, 3 

R 

Random, 224 
Random sample, 224-225 





Random sampling numbers, 226-
228

Randomization, principle of, 27-28 
Range, 60 

Rank correlation, 191-192 
formula for, 191 
in time series analysis, 287
Ranking, 11 
Rates, 109-110, 113 
Rating, 11, 45 
Ratio, 53, 109 
Recurrent universe, 222 
Regression coefficient, linear cor-
relation, 180, 190

Regression equations, linear cor-
relation, 175-176

error in predicting a mean vs.
individual values, 250
formula for, when r is known,
184-185

formulas for, 175, 176
geometric meaning of, 175
goodness of fit and standard error 
of estimate, 177-180 
normal equations, 175-176 
use of, for prediction, 179 
Relationship (gross) between two 
factors nonquantitative cor- 
relation, 197 

(See also Correlation, nonquan-
titative)

Relationship (gross) between two
factors simple linear quanti-
tative correlation, 171-196
(See also Correlation, linear)
Reliability, 20, 42-43
Repeated trials, 151 
Replication, 27 
Representative data, 6 
sample, 246 

Representativeness of an average, 
108, 130-131 
of a sample, 250-252 
Rice, Stuart A., 9
Richardson, C. H., 18, 170
Rider, P. R., 148


Root, mean square-, deviation, 124-
129

(See also Deviation, standard)
“Rounding off,” 53
Ruling of a frequency table, 72

S

Sample, 6 
Bernoulli, 224 
large, 234 
Lexis, 231 

Poisson (stratified), 224, 230-231, 
234 

proportional, 230 
random, 224-228, 255-256
representative, 246, 250-252
simple, 224-226, 229-231, 234, 256
size of, in relation to standard 
error, 234, 237 

stratified (Poisson), 224, 230-231, 
234 

taking the, 224-232
Sampling, 221, 224-232
confidence (fiducial) limits, 248
by groups of events, 228-229
general theory of, 232-234
random sampling numbers, 226-
228

unit of, 243

Sampling differences, 255-275
(See also Significance)

Sampling distribution, 232
Sampling errors, 234
simple sampling errors applied to
random and stratified sam-
ples, 234

(See also Standard error)

Scale, the, 14

Chapin’s socioeconomic, 12, 20
graphic rating, 14
Thurstone’s attitude, 15-17
Scates, Douglas, 23, 30
Scatter diagram, 171-174, 188
Schedule, editing, 48
the statistical, 37-40
testing, 42-47
Scores, 12 
Scoring, 12

Seasonal fluctuations in time series,
288-295

Second moment, 165
Secondary statistical data, 33-36
Secular trend, 277-283
Semi-interquartile range, 135-136
(See also Quartile deviation)
Semilogarithmic paper, 84-85, 88
Sheppard's correction, 128
Sigma, Σ, σ, 99, 124
Significance of a correlation coeffi- 
cient, 257-258 

of the difference between any two 
correlated statistics, 259-260 
of the difference between any two
independent statistics, 260
of the difference between the
combined mean of two simple
samples from the same uni-
verse and the mean of either
one of the samples, 266
of the difference between the
means of two samples sup-
posed to be simple samples
from the same universe, 264-
265

of the difference between the 

means of two simple samples 
from different universes, 265- 

266 

of the difference between statis- 
tics from more than two 
samples, 272-273 
of the difference between two cor- 
related means, 261-263 
of the difference between two cor- 
relation coefficients, 268-269 
of the difference between two 
independent means, 263-264 
of the difference between two or
more frequency distributions,
269-272

of the difference between two
proportions, 265-268
of g1 and g2, 258-259
levels of, 256-259
meaning of tests of, 255-257


Significance of sampling differences, 
255-275 
of a sum, 269 

Significant figures, number of, 53 
Simple sample, 224-226, 229-231 
error of sampling applied to
random and stratified sam-
ples, 234

Simple sampling, 269
test of the hypothesis of, 269
Simplicity the statistical ideal, 8
Size of sample, 234, 237, 246-249
Skewed frequency distribution, 107
binomial, 156
formulas for, 164, 165
geometric mean of, 110
graphs of, 107, 164
meaning of the standard deviation
or standard error of, 161
representativeness of averages of,
107, 108

table showing a, 164

(See also g1, index of skewness)
Slope of line, 175
Smith, James G., 9, 170
Smoothed frequencies or curve, 79
Snedecor, G. W., 29
Social sciences, 4, 5, 6
Social statistics, 3

Socioeconomic status, Chapin's scale
for measuring, 12
Sociological journals, 32
Sorenson, H., 75, 142
Sorting machine, electric, 49-50
Squares and square roots, 309-322
Standard deviation, σ, 124-129
(See also Deviation, standard) 
Standard error, 161 
of arithmetic mean, 239-240 
controlled by size of sample, 
246-249 

corrected for limited universe,
242-243 

of a frequency, 235-237 
of a population rate, 244-246 
in predicting a mean vs. individual
values from a regression equa- 
tion, 250 
Standard error of a proportion, 238-239

of biserial r, 217 

of coefficient of contingency, C, 
217 

empirical, 232 
of standard deviation, 241 

stratified or Poisson sampling, 
of arithmetic mean, 240 
of a frequency, 237 
of a proportion, 239 
of tetrachoric r, 217 
theoretical, 232 

when unit of sampling is a group
of events or a district, 243- 
244, 246 
of Yule’s Q, 217 

Standard error of estimate, linear
correlation, 177-180 
formulas for, 178, 190 
meaning of, 179 
Standard scores, 136 
Stanford-Binet intelligence test, 12,
19 

Statistic, definition of, 221 
true, 136 

Statistics, and the individual, 7 
the method of probabilities, 4 
origins of, 3
social, 3 

Statistics not a mechanical method, 
8 

Steepness of a line graph, meaning
of, 116 

Straight-line relationship, 19, 277-
281, 291 

Stratified sample, 224 
errors of simple sampling applied
to, 234 
universe, 246

Stub of a frequency table, 71 
Success, i.e., successful event, 149,
222 

Sum, significance of a, 269 
Summation, 99 

Symmetrical frequency distribution, 
graph of, 106 


Symmetrical frequency distribution, 
representativeness of average 
of, 106-108 
Symonds, P. M., 18

T 

Tables, caption, 71 
rules of form for frequency, 71-72 
ruling, 72
statistical, 41-42 
stub, 71 
title, 71 

Tabulation of frequency distributions,
hand methods, 59-75
of statistical data, mechanical
methods, 48-50 

Tabulating machine, electric, 50
Tallying, 60

Terman, Lewis M., and Maud A.
Merrill, 23
Test, final, 28 

Tetrachoric correlation, 211-217 
computing diagrams for, 215-217
formulas for, 212 
standard error of, 217 
Theoretical standard error, 232 
Thermometer, 18, 19 
Third moment, 165 
Thorndike, E. L., 45
Three-dimension graphs, 89 
Thurstone, L., attitude scale,
15-17 

computing diagrams for the tetrachoric
correlation coefficient,
215-216 

“Fundamentals of Statistics,” 196 
Time series, analysis, 276 

correlation between short-term 
cycles of two time series, 
286-288 

graphs of, 82-87, 277 
seasonal fluctuations, 288-296 
secular trend, a moving average, 
281-283 

straight line, 277-281
short-term cycles, 283-286 
freed from seasonal fluctuations, 
295-296 
Tippett, L. H. C., 170

random sampling numbers, 226-
228, 254 

Title of a frequency table, 71 
Transcription sheet, statistical, 41 
Treloar, A. E., 148, 170, 254
Trend, 277 
secular, 277-283 

U 

Unit of sampling, 221, 243-244, 246
Units, equality of, in social measurement,
12, 14, 15, 18, 19, 21, 59
Universe, 136, 221
binomial, 233 
changing, 222
existent, 222, 226 
heterogeneous, 223, 229, 246 
homogeneous, 223, 229, 234 
hypothetical, 222, 225-226, 229- 
230, 234 

infinite, 222, 228-229 
limited, 222, 226, 228, 242-243 
mixed, 246 
recurrent, 222 
stratified, 246 
unique, 222 
Unordered data, 10 

V 

Validity, 20, 42-46

Van Voorhis, W. R., and C. C.
Peters, 30, 207, 217, 220, 254
Variable, 62, 231 
continuous, 28, 62 
discrete, 62 

Variables, interfering, 28
Variance, 128 


Variation, coefficient of, 129-131 
for comparing variation, 131
formulas of, 130 

as a measure of the representa- 
tiveness of an average, 130- 
131 

need for, 129-130 

W 

Walker, Helen M., 9
and W. N. Durost, 75
Waugh, A. E., 297
Weighted arithmetic mean, 63, 104 
Weighting, 12
Whelpton, P. K., 32
White, R. C., 75, 142, 196, 297
Wolf, A., 30

X 

X axis, 77 
χ² (see Chi-square)

Y 

Y axis, 77 

Y-intercept, 175, 190 
Young, Pauline V., 55
Yule, G. U., and M. G. Kendall, 24,
75, 121, 170, 175, 182, 196, 
220, 254 

Yule's Q, coefficient of correlation 
for fourfold tables, 210 
standard error of, 217 

Z 

Z, values of, for given values of r, 
307-308 

Zero point on scale, 12, 14, 15, 19, 22 



